The Travelling Salesman Problem






Introduction

A travelling salesman leaves home to visit some customers. He wants to travel as little as possible, but with the condition that he returns home at the end. He would also prefer not to visit any location twice.

(1) Can the salesman visit every customer?
(2) What is the shortest possible route?
(3) Can (1) be solved if no customer is visited twice?

We draw a graph with customers as vertices and routes between the customers as edges, where the weight on each edge is the physical distance between the customers which are represented by the endpoints of the edges. If there is no route, then we omit the edge. There is a solution to (1) if the graph we obtain is connected, and (3) is possible if and only if there is a Hamilton cycle in the graph.

This can be generalized so that the weights on the edges are the costs of travelling between places, or the times taken. We will talk in terms of cheapest routes and the weight of a solution. These can be thought of as minimum distance and distance, respectively.

What is the cheapest possible route?

Unfortunately there is no efficient algorithm known for solving (2), so a trial and error method must be used, that is generate all possible routes, compute their weight and check to see if we have a best possible solution. See http://cs.colgate.edu/faculty/stina/courses/cosc/102/f03/labs/project/proj-tsp.html for an example of one such method - branch and bound.
Suppose we allow the salesman to visit the same place more than once, that is, we look for a shortest closed walk. Then an upper bound for the problem is twice the weight of a minimum spanning tree in the graph. We show in the following example that this bound is tight, that is, that sometimes the best route is twice the weight of a minimum spanning tree.

Example - shortest route is not a hamilton cycle

This graph does not satisfy the triangle inequality (see below) as for instance
w(ab)+w(be) < w(ae)
mst tsp A minimum spanning tree in this graph is
best tsp An upper bound for the problem is 2×(weight of a minimum spanning tree) =2 × 4 = 8.
The minimum weight of a hamilton cycle is 104 (abedca). The best route is actually given by traversing each edge of the minimum spanning tree twice.


Can we ever know the solution is a hamilton cycle?

Suppose we want to find a solution when the graph is complete and satisfies the triangle inequality, that is for all vertices x,y,z, the following inequality holds w(xy)+w(yz) ge w(xz), where w(ab) is the weight of the edge ab. Then a shortest route around all the customers is a Hamiltonian cycle.

Idea of the Proof

hamilton graph proof 1 hamilton graph proof 2
Let C= be a shortest closed walk in a graph satisfying the triangle inequality. Suppose C has a repeated vetex v. We can write C as: vx1x2...xkvy1...ylv. By the triangle inequality, replacing xkvy1 by xky1 gives a closed walk which is at most the same length as C, and has one less repeated vertex. Continue this replacement process with each repeated vertex until all the repeated vertices have been eliminated. Then we have a Hamiltonian cycle with at most the same length as C.

An Approximation Algorithm

Let G be a weighted complete graph satisying the triangle inequality.

1. Start with any vertex v1. Call this path P1.

2. Given the path Pk=x1x2...xk, examine the graph to see if there are any remaining unused vertices.
If there are not, then P=x1x2...xkx1 is the Nearest Neighbour approximation.
If there are vertices remaining, let xk+1 be a closest vertex to xk (if there is more than one, choose arbitarily), and set Pk+1=x1x2...xkxk+1. Continue this step until the Nearest Neighbour approximation has been constructed.

Example

tsp example Start at a, add b, path ab, length 3
from b, add e, path abe, length 7
from e, add d, path abed, length 13
from d, add c, path abedc, length 17.
The Nearest Neighbour Approximation beginning at a is abedca which has weight 23. What is the length obtained if we begin at d?


Bounds on the Problem

How good is our approximate solution? Can we find two values so that the solution to the travelling salesman problem lies between them? It is obvious that we can find some values - 0 for a minimum and the sum of all the values for a maximum. These are called a lower and upper bound respectively. We want to find lower and upper bounds which are as close together as possible!

Upper Bounds

We said earlier that the solution is no longer than twice the length of a minimum spanning tree. This is an upper bound.

In the example above, a minimum spanning tree has weight 16 and so twice this is 32, and so 32 is an upper bound. However, we have found a route around the graph which has weight just 23 which is obviously better than 32. Now 23 is at least the length of the solution (it cant be shorter than the shortest, but it might be longer!) so this is a better upper bound than 32 (as it is smaller - we need to make the upper bound as small as possible and the lower bound as large as possible.)

Lower Bounds

Consider a circuit C which is a solution to the problem. Remove one vertex v from G and C which is not repeated in C, and the edge(s) which is/are incident with it in C. Then Now, the remainder of C is a walk which contains all the other vertices. This walk must be at least the length of a minimum spanning tree in G-{v}, as that is the cheapest way of connecting the vertices in G-{v}. If we removed one edge from C then it is of weight at least the weight of the cheapest edge e at v, so C had weight at least 2×(weight of e) + weight of a minimum spanning tree in G-v. If we removed two edges e and e' then the weight of C is at least the weight of e plus the weight of e' plus the weight of a minimum spanning tree in G-v. If G satisfies the triangle inequality, we will remove two edges and so be in the second case.

In the example above, suppose we remove a. A minimum spanning tree in the remaining graph is of weight 13. The cheapest two edges at a are of weight 3 and 5. This means a lower bound is 13+5+3=21. It might be possible to get this, but then again it might not, however our answer of 23 isn't that far off, so maybe we don't care is 23 is right or not, maybe it is good enough?

Complexity

We said earlier that there is no quick algorithm for finding a solution to the travelling salesman problem. This problem is NP-complete which means (unless P=NP) that no polynomial time algorithm exists to solve it. In the worst case, how many checks need to be done in order to make sure we have the cheapest route? We need to check every possible ordering of the vertices - if there are n vertices then there are (n-1)! orderings of the vertices. This gets very big very fast. For just n=21, there are n!=2432902008176640000 possibilities to check.

In Traveling Salesman, Go Home!, by John Svetlik, the following appears.

"Getting a precise answer to this optimization problem is not easy. Currently, the world's record for the Traveling Salesperson Problem is just over 3,000 cities. That solution required the creative energies of four top researchers and the dedicated use of almost one month's time on a supercomputer.

"In sharp contrast, you can get a pretty good approximation to the optimal solution in a very short time. This is a common theme of many discrete math problems. "


In consideration of the difficulty of this problem, a balance must be met. Is a possible error of 9% going to cost more or less than the extra time and resources needed to make sure an accurate answer is obtained?

Again from "Traveling Salesman, Go Home!"
"If you can come within a reasonable degree of the right answer, in a reasonable amount of time, and at reasonable cost, you may consider that you've solved the problem-at least in the real world."

HAL KIERSTEAD


Exercises

(i) Find a minimum spanning tree in each of the following graphs.
(ii) Give an upper and a lower bound for the travelling salesman problem in each of the following graphs.
(iii) Find a Nearest Neighbour Approximation beginning at each vertex.
(iv) Can you say that any of your solutions is a cheapest route?
1tsp example 2tsp example

Sarah Norell
c/o Pia Heidtmann
© MITTUNIVERSITETET
Institutionen för Teknik, Fysik och Matematik
Mittuniversitetet
SE-851 70 SUNDSVALL
Sweden
Updated 040514.