来源:F.Rossi, P.Van Beek, T. Walsh. Handbook of Constraints Programming. Elsevier, 2006.
There are three main algorithmic techniques for solving constraint satisfaction problems: backtracking search, local search, and dynamic programming. In this chapter, I survey backtracking search algorithms. Algorithms based on dynamic programming [15]—sometimes referred to in the literature as variable elimination, synthesis, or inference algorithms—are the topic of Chapter 7. Local or stochastic search algorithms are the topic of Chapter 5.
An algorithm for solving a constraint satisfaction problem(CSP) can be either complete or incomplete. Complete, or systematic algorithms, come with a guarantee that a solution will be found if one exists, and can be used to show that a CSP does not have a solution and to find a provably optimal solution. Backtracking search algorithms and dynamic programming algorithms are, in general, examples of complete algorithms. Incomplete, or non-systematic algorithms, cannot be used to show a CSP does not have a solution or to find a provably optimal solution. However, such algorithms are often effective at finding a solution if one exists and can be used to find an approximation to an optimal solution. Local or stochastic search algorithms are examples of incomplete algorithms.
Of the two classes of algorithms that are complete—backtracking search and dynamic programming—backtracking search algorithms are currently the most important in practice. The drawbacks of dynamic programming approaches are that they often require an exponential amount of time and space, and they do unnecessary work by finding, or making it possible to easily generate, all solutions to a CSP. However, one rarely wishes to find all solutions to a CSP in practice. In contrast, backtracking search algorithms work on only one solution at a time and thus need only a polynomial amount of space.
Since the first formal statements of backtracking algorithms over 40 years ago [30, 57], many techniques for improving the efficiency of a backtracking search algorithm have been suggested and evaluated. In this chapter, I survey some of the most important techniques including branching strategies, constraint propagation, nogood recording, backjumping, heuristics for variable and value ordering, randomization and restart strategies, and alternatives to depth-first search. The techniques are not always orthogonal and sometimes combining two or more techniques into one algorithm has a multiplicative effect (such as combining restarts with nogood recording) and sometimes it has a degradation effect (such as increased constraint propagation versus backjumping). Given the many possible ways that these techniques can be combined together into one algorithm, I also survey work on comparing backtracking algorithms. The best combinations of these techniques result in robust backtracking algorithms that can now routinely solve large, hard instances that are of practical importance.
4.1 Preliminaries
In this section, I first define the constraint satisfaction problem followed by a brief review of the needed background on backtracking search.
Definition 4.1 (CSP). A constraint satisfaction problem (CSP) consists of a set of variables, X = {x1, . . . , xn}; a set of values, D = {a1, . . . , ad}, where each variable xi ∈ X has an associated finite domain dom(xi) ⊆ D of possible values; and a collection of constraints.
Each constraint C is a relation—a set of tuples—over some set of variables, denoted by vars(C). The size of the set vars(C) is called the arity of the constraint. A unary constraint is a constraint of arity one, a binary constraint is a constraint of arity two, a non-binary constraint is a constraint of arity greater than two, and a global constraint is a constraint that can be over arbitrary subsets of the variables. A constraint can be specified intensionally by specifying a formula that tuples in the constraint must satisfy, or extensionally by explicitly listing the tuples in the constraint. A solution to a CSP is an assignment of a value to each variable that satisfies all the constraints. If no solution exists, the CSP is said to be inconsistent or unsatisfiable.
As a running example in this survey, I will use the 6-queens problem: how can we place 6 queens on a 6 × 6 chess board so that no two queens attack each other. As one possible CSP model, let there be a variable for each column of the board {x1, . . . , x6}, each with domain dom(xi) = {1, . . . , 6}. Assigning a value j to a variable xi means placing a queen in row j, column i. Between each pair of variables xi and xj, 1≤i < j≤ 6, there is a constraint C(xi, xj), given by (xi ≠ xj) ∧ (|i − j| ≠ |xi − xj |). One possible solution is given by {x1 = 4, x2 = 1, x3 = 5, x4 = 2, x5 = 6, x6 = 3}.
The satisfiability problem (SAT) is a CSP where the domains of the variables are the Boolean values and the constraints are Boolean formulas. I will assume that the constraints are in conjunctive normal form and are thus written as clauses. A literal is a Boolean variable or its negation and a clause is a disjunction of literals. For example, the formula ¬x1∨x2∨x3 is a clause. A clause with one literal is called a unit clause; a clause with no literals is called the empty clause. The empty clause is unsatisfiable.
A backtracking search for a solution to a CSP can be seen as performing a depth-first traversal of a search tree. The search tree is generated as the search progresses and represents alternative choices that may have to be examined in order to find a solution. The method of extending a node in the search tree is often called a branching strategy, and several alternatives have been proposed and examined in the literature (see Section 4.2). A backtracking algorithm visits a node if, at some point in the algorithm’s execution, the node is generated. Constraints are used to check whether a node may possibly lead to a solution of the CSP and to prune subtrees containing no solutions. A node in the search tree is a dead end if it does not lead to a solution.
The naive backtracking algorithm (BT) is the starting point for all of the more sophisticated backtracking algorithms (see Table 4.1). In the BT search tree, the root node at level 0 is the empty set of assignments and a node at level j is a set of assignments {x1 = a1, . . . , xj = aj}. At each node in the search tree, an uninstantiated variable is selected and the branches out of this node consist of all possible ways of extending the node by instantiating the variable with a value from its domain. The branches represent the different choices that can be made for that variable. In BT, only constraints with no uninstantiated variables are checked at a node. If a constraint check fails—a constraint is not satisfied—the next domain value of the current variable is tried. If there are no more domain values left, BT backtracks to the most recently instantiated variable. A solution is found if all constraint checks succeed after the last variable has been instantiated.
Figure 4.1 shows a fragment of the backtrack tree generated by the naive backtracking algorithm (BT) for the 6-queens problem. The labels on the nodes are shorthands for the set of assignments at that node. For example, the node labeled 25 consists of the set of assignments {x1 = 2, x2 = 5}. White dots denote nodes where all the constraints with no uninstantiated variables are satisfied (no pair of queens attacks each other). Black dots denote nodes where one or more constraint checks fail. (The reasons for the shading and dashed arrows are explained in Section 4.5.) For simplicity, I have assumed a static order of instantiation in which variable xi is always chosen at level i in the search tree and values are assigned to variables in the order 1, . . . , 6.