Solving a sudoku with convex optimization

16 May 2019

A sudoku puzzle is a nice example of an integer programming problem: the goal is to find a solution that satisfies the rules of the game. We’re going to try to solve sudokus using convex optimization.

Here’s an example of a sudoku puzzle:

Unsolved sudoku

The rules are quite simple:

Use the numbers 1 to 9 to fill in the empty squares,
Every row must contain each number once,
Every column must contain each number once,
Every 3x3 square as indicated by the thick black lines must contain each number once.

In this post we first write the sudoku as an integer optimization problem. We then propose a convex relaxation and test it on a sudoku.

The integer programming formulation

We will reformulate this optimization problem using variables $x_{ijk} \in \{0,1\},~ 1 \leq i,j,k \leq 9$, following (Conforti, Cornuéjols, & Zambelli, 2014).

The indices $i,j$ give the position $(i,j)$ in the grid, and the index $k$ indicates which integer is filled in at that position: $x_{ijk} = 1$ if square $(i,j)$ contains the number $k$, and $x_{ijk} = 0$ otherwise.
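
The indexing can be made concrete with a small sketch. The helper names `encode` and `decode` are made up for this illustration, the code uses 0-based indices (so `x[i][j][k] == 1` means digit `k + 1`), and the one-clue grid is a toy example, not the puzzle shown above.

```python
def encode(grid):
    """Map a 9x9 grid of digits (0 = empty) to a 9x9x9 nested list x.

    x[i][j][k] == 1 means that square (i, j) holds the digit k + 1.
    """
    x = [[[0] * 9 for _ in range(9)] for _ in range(9)]
    for i, row in enumerate(grid):
        for j, digit in enumerate(row):
            if digit:
                x[i][j][digit - 1] = 1
    return x


def decode(x):
    """Inverse of encode: read off the digit with the largest indicator, 0 if none is set."""
    grid = [[0] * 9 for _ in range(9)]
    for i in range(9):
        for j in range(9):
            if any(x[i][j]):
                grid[i][j] = x[i][j].index(max(x[i][j])) + 1
    return grid


# Toy grid with a single filled square: a 5 in the first row, third column.
grid = [[0] * 9 for _ in range(9)]
grid[0][2] = 5
x = encode(grid)
```

Here the only nonzero variable is `x[0][2][4]`, which corresponds to $x_{1,3,5} = 1$ in the 1-based notation of the text.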

The rules are then as follows:

Every row must contain the integers 1 to 9:

$\sum_j x_{ijk} = 1 \quad \text{for} \quad 1 \leq i,k \leq 9$

Every column must contain the integers 1 to 9:

$\sum_i x_{ijk} = 1 \quad \text{for} \quad 1 \leq j,k \leq 9$

Every 3x3 square as indicated by the thick black lines must contain each number once:

$\sum_{q = 0}^{2} \sum_{r=0}^2 x_{i+q,j+r,k} = 1 \quad \text{for} \quad i,j \in \{1,4,7\},~ 1 \leq k \leq 9$

And in the $k$-direction exactly one $x_{ijk}$ should be 1, so that every square holds exactly one number:

$\sum_k x_{ijk} = 1 \quad \text{for} \quad 1 \leq i,j \leq 9$
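
The four families of equalities can be checked numerically on a candidate $x$. The following is a minimal sketch, not code from the cited references: `constraint_sums` and `is_feasible` are hypothetical helper names, indices are 0-based, and the solved grid is generated from a standard shifting pattern rather than taken from the puzzle above.

```python
from itertools import product


def constraint_sums(x):
    """Left-hand sides of the row, column, block and cell constraints, in that order."""
    rng = range(9)
    rows = [sum(x[i][j][k] for j in rng) for i, k in product(rng, rng)]
    cols = [sum(x[i][j][k] for i in rng) for j, k in product(rng, rng)]
    blocks = [sum(x[i + q][j + r][k] for q in range(3) for r in range(3))
              for i, j, k in product((0, 3, 6), (0, 3, 6), rng)]
    cells = [sum(x[i][j][k] for k in rng) for i, j in product(rng, rng)]
    return rows + cols + blocks + cells


def is_feasible(x, tol=1e-9):
    """True if every constraint sum equals 1 up to the tolerance."""
    return all(abs(v - 1) <= tol for v in constraint_sums(x))


# Hypothetical solved grid built from a shifting pattern (an assumption for the demo).
solved = [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)] for r in range(9)]
x = [[[1 if solved[i][j] == k + 1 else 0 for k in range(9)]
      for j in range(9)] for i in range(9)]

print(len(constraint_sums(x)), is_feasible(x))  # prints: 324 True
```

Each family contributes 81 equalities, giving 324 affine constraints on 729 variables.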

Additionally we are given some pre-filled squares, for which $x_{ijk} = 1$ if square $(i,j)$ is filled with the number $k$. We abbreviate this with the notation $x_{ijk} \in E$, where the set $E$ specifies which squares are filled in, and with which number.
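
One way to represent the set $E$ in code is as a set of index triples. This is only a sketch under assumptions: `clue_set` and `respects_clues` are hypothetical names, indices are 0-based, and the puzzle is a made-up one obtained by blanking squares of a pattern-generated grid.

```python
def clue_set(puzzle):
    """Collect the pre-filled squares as triples (i, j, k): digit k + 1 sits at (i, j)."""
    return {(i, j, d - 1)
            for i, row in enumerate(puzzle)
            for j, d in enumerate(row) if d}


def respects_clues(x, E, tol=1e-9):
    """True if x[i][j][k] equals 1 (up to tol) for every clue (i, j, k) in E."""
    return all(abs(x[i][j][k] - 1) <= tol for (i, j, k) in E)


# Hypothetical puzzle: keep the squares of a pattern-generated grid where i + j is even.
solved = [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)] for r in range(9)]
puzzle = [[d if (i + j) % 2 == 0 else 0 for j, d in enumerate(row)]
          for i, row in enumerate(solved)]
E = clue_set(puzzle)
```

In an optimization model each triple in `E` would become one more equality constraint, $x_{ijk} = 1$.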

From this analysis we now have a $\{0,1\}$-integer programming problem with affine constraints.

A number of methods have been developed to solve these sudokus, see for example (Chi & Lange, 2012). Even convex methods based on the convex relaxation of the sparsity of $x$ have been proposed (Babu, Pelckmans, Stoica, & Li, 2010).

We’re trying something different here, namely a convex method based on the relaxation of the integer variable $x_{ijk}$ (Doelman & Verhaegen, 2016).

Convex optimization for sudokus

The approach is similar to the one used in a previous post on the Travelling Salesman Problem, which is also a $\{0,1\}$-integer programming problem.

Let’s take a closer look at the constraint $x_{ijk} \in \{0,1\}$. Introducing a simple substitution $z = 2x_{ijk} - 1$, we can write down some equivalent constraints:

$\begin{equation} \begin{aligned} & x_{ijk} \in \{0,1\} \\ & x_{ijk} \in \mathbb{R},~ x_{ijk}(x_{ijk}-1) = 0 \\ & x_{ijk} \in \mathbb{R},~ z \in \mathbb{R}, ~ (2x_{ijk} - 1)(2x_{ijk} - 1) = z^2 = 1 \\ & z \in \mathbb{R}, ~ \text{rank} \begin{bmatrix} 1 & z \\ z & 1 \end{bmatrix} = 1 \end{aligned} \end{equation}$

In the last constraint $z$ (or $x_{ijk}$) does not appear in a product. We continue with a more general formulation of this constraint, using a (fixed) parameter $s \in \mathbb{R}$:

$\begin{equation} z \in \mathbb{R}, ~ \text{rank} \begin{bmatrix} 1 + s^2 + 2zs & z+s \\ z+s & 1 \end{bmatrix} = 1 \end{equation}$

For $s=0$ we recover the constraint we had earlier. Since $s$ is not part of the optimization, it can be fixed to any value and the equivalence still holds. For each binary variable we denote the matrix in the argument of the rank function in (2) by $M(x_{ijk},s_{ijk})$.
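
As a quick check that the choice of $s$ does not matter, the determinant of the matrix in (2) can be written out. This is a short derivation added for illustration, using only the definitions above, with $z = 2x_{ijk} - 1$:

$\begin{aligned} \det \begin{bmatrix} 1 + s^2 + 2zs & z+s \\ z+s & 1 \end{bmatrix} &= 1 + s^2 + 2zs - (z+s)^2 \\ &= 1 + s^2 + 2zs - z^2 - 2zs - s^2 \\ &= 1 - z^2 \end{aligned}$

The lower-right entry is 1, so the matrix is never zero and its rank is at least 1. The rank is therefore exactly 1 if and only if the determinant vanishes, that is, if $z^2 = 1$, and this condition does not involve $s$.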

The convex heuristic we use in an attempt to solve the sudoku is to drop the rank constraint and instead add, for each binary variable, the nuclear norm (sum of the singular values) of the matrix $M$, denoted as $\|M(x_{ijk},s_{ijk})\|_*$, to the objective function.
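
To get a feel for this objective term, the nuclear norm of the 2x2 matrix can be evaluated in closed form. The sketch below relies on the fact that for a symmetric matrix the singular values are the absolute values of the eigenvalues; the function names `m_matrix` and `nuclear_norm` are made up for this illustration.

```python
import math


def m_matrix(x, s):
    """Entries (a, b, c) of the symmetric matrix M(x, s) = [[a, b], [b, c]], with z = 2x - 1."""
    z = 2.0 * x - 1.0
    return 1.0 + s * s + 2.0 * z * s, z + s, 1.0


def nuclear_norm(x, s):
    """Sum of the singular values of M(x, s), via the eigenvalues of a symmetric 2x2 matrix."""
    a, b, c = m_matrix(x, s)
    mean = 0.5 * (a + c)                    # midpoint of the two eigenvalues
    radius = math.hypot(0.5 * (a - c), b)   # half the distance between them
    return abs(mean + radius) + abs(mean - radius)


for x, s in [(0.5, 0.0), (1.0, 0.0), (1.0, -1.0)]:
    print(x, s, nuclear_norm(x, s))
```

In this sketch the value is 2 for both $x = 0.5$ and $x = 1$ when $s = 0$, and it drops to 1 for $x = 1$ with $s = -1$. For $s = 0$ the matrix is positive semidefinite for every $x \in [0,1]$, so its nuclear norm equals its trace, which is constant; this suggests that a nonzero $s$ is what lets the objective prefer one binary value over the other.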

The convex optimization problem we now solve is the following:

$\begin{equation} \begin{aligned} & \text{minimize} && \sum_{i,j,k} \|M(x_{ijk},s_{ijk})\|_* \\ & \text{subject to} && \sum_j x_{ijk} = 1 \quad \text{for} \quad 1 \leq i,k \leq 9 \\ & && \sum_i x_{ijk} = 1 \quad \text{for} \quad 1 \leq j,k \leq 9 \\ & && \sum_{q = 0}^{2} \sum_{r=0}^2 x_{i+q,j+r,k} = 1 \quad \text{for} \quad i,j \in \{1,4,7\},~ 1 \leq k \leq 9 \\ & && \sum_k x_{ijk} = 1 \quad \text{for} \quad 1 \leq i,j \leq 9 \\ & && x_{ijk} \in E \end{aligned} \end{equation}$

If we do not get a valid solution (some $x_{ijk} \notin \{0,1\}$), we iterate by using new parameters $s_{ijk}^+ = 1-2x_{ijk}^*$, where $x_{ijk}^*$ denotes the optimal value of the variable found by solving the convex optimization problem.
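
The outer iteration can be sketched as follows. This is an illustration under assumptions, not the implementation used for the results in this post: `solve_relaxation` is a hypothetical user-supplied function that takes the flat list of parameters $s$ and returns the optimal $x^*$ of the convex problem as a flat list, and the tolerance and iteration limit are arbitrary choices.

```python
def is_binary(x, tol=1e-6):
    """True if every entry of the flat list x is within tol of 0 or 1."""
    return all(min(abs(v), abs(v - 1.0)) <= tol for v in x)


def next_s(x_star):
    """Parameter update from the text: s+ = 1 - 2 x*."""
    return [1.0 - 2.0 * v for v in x_star]


def iterate(solve_relaxation, s0, max_iter=10, tol=1e-6):
    """Re-solve the relaxation with updated s until x is binary or max_iter is reached.

    Returns (x, number of solves, converged flag).
    """
    s = list(s0)
    x = None
    for it in range(1, max_iter + 1):
        x = solve_relaxation(s)   # hypothetical convex solve for fixed s
        if is_binary(x, tol):
            return x, it, True
        s = next_s(x)
    return x, max_iter, False
```

Note that the update sets $s^+ = -z^*$ with $z^* = 2x^* - 1$, so a fractional $x^* = 0.5$ maps back to $s^+ = 0$.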

Examples

The approach as proposed above works for the sudoku shown earlier, converging to the correct solution (with integer variables) after two iterations. The solution is as follows:

Correct sudoku solve

There are sudokus that the approach above doesn’t solve as easily as the first example (or, for the initializations of $s_{ijk}$ that I tried, not at all). Here’s the result for a different sudoku with far fewer filled-in squares:

Incorrect sudoku solve

The result isn’t integer at all! Not only that: because the $x_{ijk}$ do not converge to 0 or 1, we get repeated entries in columns, rows and blocks, even though the affine constraints have been handled by the convex solver. This means we have something left to research: what makes a sudoku “solvable” in this way, and to what extent does it depend on (a random initialization of) the parameters $s_{ijk}$?

Conclusion

We’ve rewritten the sudoku as an integer programming problem, and then as a $\{0,1\}$-integer feasibility problem. We used a convex heuristic to find integer solutions, which turned out to work well in some cases and not so well in others. How exactly the success rate depends on the properties of the sudoku is something we still have to research.

Bibliography

Conforti, M., Cornuéjols, G., & Zambelli, G. (2014). Integer programming (Vol. 271). Springer.
Chi, E. C., & Lange, K. (2012). Techniques for solving sudoku puzzles. arXiv preprint arXiv:1203.2295.
Babu, P., Pelckmans, K., Stoica, P., & Li, J. (2010). Linear systems, sparse solutions, and Sudoku. IEEE Signal Processing Letters, 17(1), 40–42.
Doelman, R., & Verhaegen, M. (2016). Sequential convex relaxation for convex optimization with bilinear matrix equalities. In 2016 European Control Conference (ECC) (pp. 1946–1951). IEEE.