Solving a travelling salesman problem with convex optimization

10 May 2019

The travelling salesman problem is a mixed-integer programming problem where the objective is to find the best route past a selection of cities. Here we attempt to solve it in a non-traditional way: by using convex optimization.

Suppose you are given a list of cities and the travelling time between each of them. The travelling salesman problem (TSP) is to find a route that visits all the cities exactly once, and is as short as possible.

The classical problem formulation

This problem can be formulated as a mixed integer (0-1) linear programming problem, as first proposed in (Dantzig, Fulkerson, & Johnson, 1954). See also (Conforti, Cornuéjols, & Zambelli, 2014) Chapter 2.

For a TSP with $n$ cities, define the matrix $X \in \{0,1\}^{n \times n}$ . The element $x_{i,j}$ is 1 if the route leads from city $i$ to city $j$ , and 0 otherwise.

As constraints, we need that each city is visited exactly once. Regarding the matrix $X$ , this means that each row and each column of $X$ sums up to 1.

Furthermore, to ensure that each city is visited in the same trip, we add so-called ‘subtour elimination constraints’. There are exponentially many of these, but we can add them one-by-one when we detect them in an intermediate solution. For example, if a solution contains a subtour between cities 3 and 4, we can add a constraint $x_{3,4} + x_{4,3} \leq 1$ . I’ll denote these constraints as $X \in E$ , with $E$ the set that eliminates subtours. We start out with an empty set, $E = \emptyset$ .

A travelling saleman problem is also characterized by a matrix that describes the cost (or travelling time) for travel between cities. Let’s call this matrix $C \in \mathbb{R}_+^{n \times n}$ , with $c_{i,j}$ the cost incurred when travelling from city $i$ to $j$ . The total cost for a given route can be expressed as $\text{trace}(C^TX)$ , which is linear in all $x_{i,j}$ ’s. In the example we work with, I generated some random costs for travel between 7 cities, visualized here with strong links for high costs and thin links for low costs: Weigths

The ‘problem’ in ‘travelling salesman problem’ is that we would like to find the route that incurs the lowest total cost.

The full optimization problem is

$\begin{equation} \begin{aligned} & \text{minimize} && \text{trace}(C^TX) \\ & \text{subject to} && X \mathbf{1} = \mathbf{1} \\ & && X^T \mathbf{1} = \mathbf{1} \\ & && X \in E \\ & && X \in \{0,1\}^{n \times n} \end{aligned} \end{equation}$

Rewriting the optimization problem

What we’ll do now is to take a closer look at the constraint $x_{i,j} \in \{0,1\}$ . Introducing $z = 2x_{i,j} - 1$ , we can write down some equivalent constraints:

$\begin{equation} \begin{aligned} & x_{i,j} \in \{0,1\} \\ & x_{i,j} \in \mathbb{R},~ x_{i,j}(x_{i,j}-1) = 0 \\ & x_{i,j} \in \mathbb{R},~ z \in \mathbb{R}, ~ (2x_{i,j} - 1)(2x_{i,j} - 1) = z^2 = 1 \\ & z \in \mathbb{R}, ~ \text{rank} \begin{bmatrix} 1 & z \\ z & 1 \end{bmatrix} = 1 \end{aligned} \end{equation}$

The last constraint is interesting, because here $z$ (or $x_{i,j}$ ) does not appear in a product. We’re going to continue with a more general formulation of this constraint, using a parameter $s \in \mathbb{R}$ :

$\begin{equation} z \in \mathbb{R}, ~ \text{rank} \begin{bmatrix} 1 + s^2 + 2zs & z+s \\ z+s & 1 \end{bmatrix} = 1 \end{equation}$

For $s=0$ we recover the constraint we had earlier. $s$ can be fixed to anything, the equivalence still holds. The above matrix we denote with $M(x_{i,j},s_{i,j})$ for each binary variable.

Convex heuristic

Enforcing the rank constraint is in general an NP hard problem. However, we can use the nuclear norm as a convex heuristic. The nuclear norm of a matrix is the sum of its singular values.

What we do, is drop the rank constraints, and add for each binary variable the nuclear norm of the associated affine matrix, $\|M(x_{i,j},s_{i,j})\|_*$ , to the objective function.

This means we obtain the convex optimization problem

$\begin{equation} \begin{aligned} & \text{minimize} && \text{trace}(C^TX) + \lambda \sum_{i = 1}^{n}\sum_{j = 1}^{n} \|M(x_{i,j},s_{i,j})\|_* \\ & \text{subject to} && X \mathbf{1} = \mathbf{1} \\ & && X^T \mathbf{1} = \mathbf{1} \\ & && X \in E \\ \end{aligned} \end{equation}$

where $\lambda > 0$ is a weighting parameter.

We might not obtain the optimal solution from solving this optimization problem. In fact, we might not obtain a valid solution at all if the solution for the continuous variable $x_{i,j}$ does not equal zero or one.

If this happens, what I propose is to solve the problem again, but now with $s_{i,j}$ parameterized with $s_{i,j} = 1-2x_{i,j}^*$ , where $x_{i,j}^*$ denotes the optimal value of the variable we found solving the convex optimization problem.

Numerical example

Given the example above, I implemented the procedure as proposed above. Initializing $s_{i,j}$ with values drawn from a uniform distribution between -1 and 1, we obtain the following solution after 2 iterations:

Route with 0 subtours eliminated

The optimal route is coloured yellow, the solution (with all $x_{i,j}$ converged to either 0 or 1), is coloured blue. The optimal route was found by enumerating all routes and checking the total travel time. There is a subtour between 4 and 6, 1 and 5, and 2,3 and 7. We eliminate these subtours by adding constraints to the convex problem and repeat the procedure, again needing 2 iterations for convergence.

Route with 3 subtours eliminated

There are two subtours left: between 2 and 7, and between the rest of the cities. Adding again subtour constraints, we obtain, again after 2 iterations,

Route with 4 subtours eliminated

We found the optimal route by solving a total of 6 convex optimization problems.

Discussion and conclusion

Here I’ve shown how one can attempt to solve a Travelling Salesman Problem with convex optimization, but without using a classic branch and bound(/cut) technique.

There are a couple of downsides to the proposed approach. To name a few:

If a valid tour is found, there is no guarantee it is the optimal tour. However, the solution might be a good initialization for a true mixed-integer solver.
If one solves the convex optimization problem repeatedly, I cannot guarantee it converges to a valid solution of the mixed-integer problem. In the example only two were needed, but this need not be the case for other problems.
The variable $\lambda$ needs to be tuned. I’ve had luck with tuning it as low as possible.
Using nuclear norms results in a Semidefinite Programming problem (SDP). For each binary variable one needs an extra term for the objective function, and the number of variables increases quadratically with the number of cities. This method is therefore computationally quite expensive.

The interesting question that remains is of course how the values for $s_{i,j}$ and $\lambda$ relate to the success rate and the optimal solution of the mixed-integer optimization problem. This is something I’ll leave for future study.

Bibliography

Dantzig, G., Fulkerson, R., & Johnson, S. (1954). Solution of a large-scale traveling-salesman problem. Journal of the Operations Research Society of America, 2(4), 393–410.
Conforti, M., Cornuéjols, G., & Zambelli, G. (2014). Integer programming (Vol. 271). Springer.