There are constraints all around us. They are a natural medium for expressing problems in many fields. The paradigm for solving such problems is called constraint programming. Its idea is to search for a state in which all of the defined constraints are satisfied at the same time. Currently, we see two branches of constraint programming, namely constraint satisfaction and constraint solving. Both describe a problem as a set of constraints and then solve it; however, the former deals only with problems defined over finite domains. In this article I’m going to solve a sudoku puzzle using an elementary technique of this field called constraint propagation.
A constraint satisfaction problem is defined as a set of variables, each with a finite set of possible values (its domain), and a set of constraints restricting the values that the variables can simultaneously take. The sudoku puzzle is a perfect example of a problem that can be modeled this way. It is a well-known number placement puzzle: given a partially filled 9×9 grid, the goal is to place the numbers 1 to 9 in the empty squares in such a way that in each row, column, and 3×3 sub-grid, each number occurs exactly once.
Thus, according to our definition, each square corresponds to one variable, the domain of each variable is the set of numbers from 1 to 9, and the constraints express non-equality of variables in the same row, column, or sub-grid. A solution is an assignment of a value from its domain to every variable such that all constraints are satisfied at once. Most algorithms are based on a search through the possible assignments of values to all variables.
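The model above can be sketched directly in code. This is a minimal illustration, not a library API: the names `variables`, `domains`, `constraints`, and the helper `same_unit` are my own, and the constraints are expressed as ordered pairs of mutually constrained squares.

```python
def same_unit(a, b):
    """True if squares a and b (as (row, col) pairs) share a row,
    column, or 3x3 box, i.e. are linked by a non-equality constraint."""
    (ra, ca), (rb, cb) = a, b
    return ra == rb or ca == cb or (ra // 3, ca // 3) == (rb // 3, cb // 3)

# 81 variables, one per square of the 9x9 grid.
variables = [(r, c) for r in range(9) for c in range(9)]

# Each variable's domain is initially the full set {1..9}.
domains = {v: set(range(1, 10)) for v in variables}

# Binary constraints: every ordered pair of distinct squares in the
# same row, column, or box must take different values.
constraints = {(a, b) for a in variables for b in variables
               if a != b and same_unit(a, b)}
```

Each square shares a unit with exactly 20 other squares (8 in its row, 8 in its column, and 4 remaining in its box), so this model contains 81 × 20 ordered constraint pairs.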
Backtracking search is the simplest algorithm for solving constraint satisfaction problems. It is a recursive algorithm which incrementally extends a partial assignment of the variables. At each step, it picks an unassigned variable and assigns it a value consistent with the constraints. Once a value has been found, it picks another variable and repeats this process. If a variable has no possible values, the algorithm backtracks to the previously assigned variable and changes its value.
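The recursive procedure above can be sketched as follows. This is an illustrative bare-bones solver, not the article’s actual implementation: the grid is assumed to be a 9×9 list of lists with 0 marking an empty square.

```python
def consistent(grid, r, c, v):
    """Check that placing v at (r, c) violates no row, column,
    or 3x3 box constraint."""
    if any(grid[r][j] == v for j in range(9)):
        return False
    if any(grid[i][c] == v for i in range(9)):
        return False
    br, bc = 3 * (r // 3), 3 * (c // 3)
    return all(grid[br + i][bc + j] != v
               for i in range(3) for j in range(3))

def solve(grid):
    """Fill the first empty square with a consistent value and recurse;
    undo the assignment and try the next value on failure."""
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                for v in range(1, 10):
                    if consistent(grid, r, c, v):
                        grid[r][c] = v
                        if solve(grid):
                            return True
                        grid[r][c] = 0   # undo: backtrack to try next value
                return False             # dead end: no value fits here
    return True                          # no empty squares left: solved
```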
Thanks to its brute-force nature, backtracking can solve almost any problem: it is guaranteed to find a solution because it will eventually try every possible configuration. The disadvantage of this method is that solving may be very slow, since it cannot detect a conflict between variables before the conflict actually occurs.
Backtracking progressively instantiates variables, and the order in which they are instantiated can greatly affect the performance of the algorithm. The minimum remaining values heuristic picks as the next variable to assign the one with the fewest remaining values in its domain, so dead ends are discovered sooner. If there is a variable with no available values, this heuristic selects it first and avoids a pointless search through the other variables. It is also called the most constrained variable or fail-first heuristic. For the sudoku puzzle, this means that instead of choosing the first empty square, the algorithm chooses the square with the fewest possible values.
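The heuristic itself is a one-liner. In this sketch, `candidates` is an assumed mapping from each empty square to its set of values still consistent with the constraints:

```python
def pick_square(candidates):
    """Minimum remaining values: return the empty square with the
    smallest candidate set (fail-first)."""
    return min(candidates, key=lambda sq: len(candidates[sq]))
```

The search then branches on that square’s few candidates instead of the nine values of an arbitrary empty square.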
This heuristic has a tremendous impact on performance: the same sudoku puzzle can be solved up to 300 times faster than with simple backtracking. Another possible method is the degree heuristic, which chooses the unassigned variable that is involved in the most constraints with other unassigned variables.
Another improvement that reduces the search space is a method called constraint propagation. The key idea is to maintain a list of the possible values each variable can still take. After each value is assigned by the search algorithm, constraint propagation updates the possible values of every other variable and assigns values to all variables with only one possible value left. If any variable ends up with no possible values, the algorithm backtracks to the previous variable assigned by the search. The number of search branches is thus greatly reduced. Here’s a visualization of solving sudoku using constraint propagation.
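The update step described above can be sketched like this, using plain sets of candidates rather than the bitmasks introduced later. The helper `peers_of` is my own name for the function returning the 20 squares sharing a unit with a given square:

```python
def peers_of(sq):
    """The 20 squares sharing a row, column, or 3x3 box with sq."""
    r, c = sq
    row = {(r, j) for j in range(9)}
    col = {(i, c) for i in range(9)}
    box = {(3 * (r // 3) + i, 3 * (c // 3) + j)
           for i in range(3) for j in range(3)}
    return (row | col | box) - {sq}

def propagate(candidates, sq, value):
    """Assign value to sq, then remove it from every peer's candidate
    set; whenever a peer is left with a single candidate, assign that
    too. Returns False on a contradiction (an empty candidate set)."""
    candidates[sq] = {value}
    for p in peers_of(sq):
        if value in candidates[p]:
            candidates[p] = candidates[p] - {value}
            if not candidates[p]:
                return False   # contradiction: search must backtrack
            if len(candidates[p]) == 1:
                if not propagate(candidates, p, next(iter(candidates[p]))):
                    return False
    return True
```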
The implementation is quite different from and more complex than simple backtracking. Each square has to store its possible values instead of just the current one. The easiest way is to use a binary number as a bitmask. A square with possible values 9 or 7 would be represented as 101000000, and an empty square has the binary number 111111111. Thus, instead of assigning values to squares, the algorithm eliminates all values except the chosen one.
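The bitmask convention reads as follows: bit (v − 1) set means value v is still possible. A few helper functions, with names of my own choosing, make the elimination-based style concrete:

```python
ALL = 0b111111111            # an empty square: all nine values possible

def mask(*values):
    """Bitmask with the given candidate values set."""
    m = 0
    for v in values:
        m |= 1 << (v - 1)
    return m

def eliminate(m, v):
    """Remove value v from the candidate mask."""
    return m & ~(1 << (v - 1))

def assign(m, v):
    """Keep only value v, i.e. eliminate everything else."""
    return m & (1 << (v - 1))

def is_solved(m):
    """True if exactly one candidate remains (m is a power of two)."""
    return m != 0 and m & (m - 1) == 0
```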
Then it performs two more operations based on the square’s units and peers. A unit is a collection of squares in which every value has to appear exactly once. Every square belongs to exactly three units: a row, a column, and a box. Peers are the squares that share a unit with it. If a square is reduced to one value, the algorithm eliminates that value from all of its peers. Moreover, if a unit has only one possible place left for a value, the algorithm puts the value there immediately.
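The units and peers bookkeeping can be precomputed once up front. This sketch names squares with strings like 'A1' through 'I9' (rows A–I, columns 1–9); the naming scheme is an assumption of mine, not something the article prescribes:

```python
rows, cols = 'ABCDEFGHI', '123456789'
squares = [r + c for r in rows for c in cols]

unit_lists = (
    [[r + c for c in cols] for r in rows] +            # 9 rows
    [[r + c for r in rows] for c in cols] +            # 9 columns
    [[r + c for r in rs for c in cs]                   # 9 boxes
     for rs in ('ABC', 'DEF', 'GHI') for cs in ('123', '456', '789')]
)

# The three units each square belongs to, and its 20 distinct peers.
units = {s: [u for u in unit_lists if s in u] for s in squares}
peers = {s: set(sum(units[s], [])) - {s} for s in squares}
```

With these tables in place, the two propagation rules become simple loops over `peers[s]` and `units[s]`.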
Finally, constraint propagation is combined with backtracking search using the minimum remaining values heuristic. The solution is found when every square has exactly one possible value.
I’m extremely pleased with the results. Constraint propagation is a powerful concept with a simple but effective implementation. As usual, you can find the solution at my gist.