Traveling Salesman Problem: Branch and Bound Solution

Branch And Bound Solution to TSP in Python

Okan Yenigün
Towards Dev

--

Photo by Marten Bjork on Unsplash

The traveling salesman problem is a well-known problem in mathematics and optimization. A salesperson must visit several different cities and return to the starting point.

The problem involves determining the sequence in which the cities should be visited by a salesperson so that the resulting trip covers the shortest possible distance and each city is visited exactly once.

Solution of a traveling salesman problem: the black line shows the shortest possible loop that connects every red dot. Source: Wikipedia

It applies to a lot of other interesting problems with real-world impacts, such as delivery routes, tour planning, circuit board design, etc. For example, a delivery company needs to determine the most efficient route for delivering packages to multiple destinations, with the goal of minimizing the distance traveled to save cost and fuel.

Unfortunately, finding a solution to this problem is not always straightforward. It is considered an NP-Hard problem. It means that finding an exact solution for large instances becomes computationally infeasible.

Source

NP stands for non-deterministic polynomial time, which includes problems that can be solved in polynomial time by a non-deterministic Turing machine. However, the class NP-hard includes problems that may not be in NP, but are at least as hard as the hardest problems in NP. In other words, if an NP-hard problem can be solved in polynomial time, then all problems in NP can also be solved in polynomial time.

Travelling Salesman Problem becomes increasingly difficult as the size of the problem increases due to the combinatorial explosion of possible solutions. If there are n cities to visit, the number of possible routes is n! then, imagine that you have 50 cities, then it would take decades to calculate all possible solutions.

There are several ways to solve this problem.

  • Exact Algorithms: These algorithms guarantee the optimal solution, but they are often slow and not practical for large instances of TSP.
  • Heuristic algorithms: These algorithms do not guarantee the optimal solution, but can provide a good solution in a reasonable amount of time.
  • Metaheuristic algorithms: These algorithms are more general than heuristic algorithms and are often used to solve large instances of TSP.
  • Approximation algorithms: These algorithms provide a solution that is guaranteed to be within a certain percentage of the optimal solution.

In this blog post, we will examine one of the Exact Algorithms, the Branch and Bound method. The other well-known method of Exact Algorithms is called Dynamic Programming which will be mentioned in another post.

Branch and Bound

Although it sounds like a western movie title, Branch and Bound is a problem-solving strategy. It works for optimization and minimization problems. It uses a state-space tree for solving any problem. The algorithm works by recursively partitioning the search space into smaller subsets and pruning the subsets that cannot contain an optimal solution.

State Space Tree

A tree of state space is a representation of every conceivable state of a problem, starting from the root node as an initial state and ending at the leaf node as a final state.

State Space Tree. Source

Let’s assume that we have four items and a sequencing problem: {C1, C2, C3, C4}, and my solution is a subset of this: {C1, C2}.

Let’s draw the state space tree for this problem. We follow a breadth-first search when we are exploring the solutions in the Branch and Bound method.

In the beginning, we consider all options.

Image by the author.

After the first level is completed, we start again from node 2. We can visit C2, C3, or C4 in node 2. At this level, we will have no other option in node 8, because, we have nowhere to go in that node.

State space tree for the solution. Image by the author.

The above way of doing B&B is called FIFO Branch and Bound because we use Queue for this implementation. There is also another solution that uses Stacks, it is called LIFO Branch and Bound.

Ok, what if we have also a cost for each path? Then the solution is called the Least Cost Branch and Bound (LC-BB).

This time, we calculate the cost (whatever the cost function is for the problem) for every path of visit.

Ok, let’s assume that the costs for the first level are like the below.

Image by the author.

We should explore the node with the least cost (node 3).

Image by the author.

We reached the solution at node 7.

Traveling Salesman Problem Solution

Imagine we have 4 cities and the salesman starts from one.

Image by the author.

The state space tree of the problem would be like below. Remember that in Branch and Bound, we generate nodes in level order. While generating nodes, we calculate the costs. And if the cost is greater than some bounds, we kill the node. Thus, we aren’t generating all nodes, we only generate promising nodes.

State Space Tree. Image by the author.

Ok. Let’s define a new problem. We have 5 cities to visit. And on each path, the cost is displayed. And as a constraint, it is stated that if we choose any two intermediate vertices (b and c), only b can precede c.

The problem. Image by the example.

Now, we need to calculate lower bounds. For each city i, 1 ≤ i ≤ n, we will find the sum s_i of the distances from city i to the two nearest cities; and then we will compute the sum s of these n numbers. After, we will divide the results by 2, and, round up the result to the nearest integer.

lb = [s/2]

For example, if we start from a, c, and b have the lowest cost. Add the costs: 1+3. For b, the two nearest cities are a and c: Add 3+6, and so on…

lb = [(1+3) + (3+6) + (1+2) + (3+4) + (2+3)] / 2= 14

The state space tree of the problem will be as follows. Now let’s take a closer look at the solution.

State space tree. Image by the author.

The root node is a and we have already calculated the lower bound there as 14. From a, we can go to b, or d. We cannot go to c as indicated as a constraint previously in the problem (c cannot precede b).

lb (a,d) = [(1+5) + (3+6) + (1+2) + (3+5) + (2+3)] /2 = 16

We have to select a to d, so, the first term is (1+5). Remember we select the smallest two adjacent values, but one of them must be 5 since we select a to d.

lb (a,e) = [(1+8) + (3+6)+(1+2)+(3+4)+(8+2)] /2 = 19

After the first level, we choose the lowest cost (node 1) and branch out from there.

lb (a,b,c) = [(1+3)+(3+6)+(1+6)+(3+4)+(2+3)]/2 = 16

lb (a,b,d) = [(1+3)+(3+7)+(1+2)+(3+7)+(2+3)]/2 =16

lb (a,b,e) = [(1+3)+(3+9)+(1+2)+(3+4)+(2+9)]/2=19

We can continue with both 5 and 6. In the last level, we calculate the length of each tour.

length (node 8) = 3 + 6 + 4 + 3 + 8 = 24

length (node 11) = 3 + 7 + 3 + 2 + 1 = 16

When we compare all these lengths, we have an optimal solution at node 11. The path a -> b -> d -> e -> c -> a is the optimal solution.

Python Example

This was a class book explanation. Let’s implement it in Python.

Let’s solve this problem for the biggest 5 cities in Germany: Berlin, Hamburg, Munich, Cologne, and Frankfurt.

First, let’s calculate the distances:

import numpy as np
from math import radians, sin, cos, sqrt, atan2
from typing import Dict

def calculate_distances_btw_two_city(coord1: Dict[str,float], coord2: Dict[str, float]) -> float:
"""calculate the distance between two cities"""
R = 6371.0 # Earth radius in kilometers
dlat = radians(coord2["lat"] - coord1["lat"])
dlon = radians(coord2["lon"] - coord1["lon"])
a = sin(dlat/2)**2 + cos(radians(coord1["lat"])) * cos(radians(coord2["lat"])) * sin(dlon/2)**2
c = 2 * atan2(sqrt(a), sqrt(1-a))
return R * c

# cities, latitudes and longitudes
berlin = {"lat":52.5200, "lon": 13.4050}
hamburg = {"lat":53.5511, "lon": 9.9937}
munich = {"lat":48.1351, "lon": 11.5820}
cologne = {"lat":50.9375, "lon": 6.9603}
frankfurt = {"lat":50.1109, "lon": 8.6821}

cities = [berlin, hamburg, munich, cologne, frankfurt]
# an empty array
distances = np.zeros((len(cities), len(cities)))
# calculate each distance between two cities
for i in range(len(cities)):
for j in range(len(cities)):
distances[i, j] = calculate_distances_btw_two_city(cities[i], cities[j])
print(distances)
"""
[[ 0. 255.25016603 504.4153314 477.25992139 423.52802494]
[255.25016603 0. 612.42808114 356.45765396 392.98854762]
[504.4153314 612.42808114 0. 456.27053641 304.58511831]
[477.25992139 356.45765396 456.27053641 0. 152.51688396]
[423.52802494 392.98854762 304.58511831 152.51688396 0. ]]

"""

Now let’s write a function that finds Branch and Bound solution.

import numpy as np

def tsp_branch_bound(distances):
n = distances.shape[0]
best_path = None
best_cost = np.inf
# initialize stack with root node
stack = [(0, [0], set([0]), 0)]
while stack:
node = stack.pop()
if len(node[1]) == n:
# if all cities have been visited, update best path if new path is shorter
cost = node[3] + distances[node[1][-1], 0]
if cost < best_cost:
best_path = node[1]
best_cost = cost
else:
# generate children nodes by considering all unvisited cities
unvisited = set(range(n)) - node[2]
for i in unvisited:
child_path = node[1] + [i]
child_cost = node[3] + distances[node[1][-1], i]
# prune nodes with cost higher than current best cost
if child_cost < best_cost:
stack.append((i, child_path, node[2] | set([i]), child_cost))
return best_path, best_cost

And the result is:

best_path, best_cost = tsp_branch_bound(distances)

# Print the results
print("Best path:", [f"City {i+1}" for i in best_path])
print("Best cost:", round(best_cost, 2), "km")

"""
Best path: ['City 1', 'City 3', 'City 5', 'City 4', 'City 2']
Best cost: 1573.23 km
"""

Let’s display the path on the map:

import folium

# define the latitude and longitude of each city
berlin = (52.5200, 13.4050)
hamburg = (53.5511, 9.9937)
munich = (48.1351, 11.5820)
cologne = (50.9375, 6.9603)
frankfurt = (50.1109, 8.6821)

# create a map centered on Germany
m = folium.Map(location=(51.1657, 10.4515), zoom_start=6)

# add markers for each city
for city, location in zip(['Berlin', 'Hamburg', 'Munich', 'Cologne', 'Frankfurt'], [berlin, hamburg, munich, cologne, frankfurt]):
folium.Marker(location, popup=city).add_to(m)

# add a polyline for the optimal path
path = [berlin, frankfurt, cologne, munich, hamburg, berlin] # ordered list of city locations
folium.PolyLine(path, color='red', weight=3).add_to(m)

# display the map
m

Conclusion

The branch and bound algorithm is particularly useful when the number of cities to be visited is small to moderate in size, and an exact solution is required. In other cases, heuristics or approximation algorithms may be more appropriate. Overall, the branch and bound algorithm is an important technique for solving the traveling salesman problem, and its implementation in Python can be applied to many different scenarios where optimal route planning is required.

Read More

Sources

https://en.wikipedia.org/wiki/Travelling_salesman_problem

https://en.wikipedia.org/wiki/NP-hardness

http://naveenkandwal.blogspot.com/2015/01/p-np-np-complete-np-hard.html

https://www.youtube.com/watch?v=aaaVm6uUY5A

https://www.youtube.com/watch?v=1FEP_sNb62k&t=679s

https://www.youtube.com/watch?v=3RBNPc0_Q6g

--

--