A Primer on Boolean Satisfiability
June 23, 2017
First steps to adding the magic of SAT to your problem-solving toolbox.
The SAT problem asks if a boolean formula has a satisfying assignment—a way to bind its variables to values that makes the formula evaluate to true.
SAT is a classic NP-complete problem. It’s easy to check the correctness of a solution (by evaluating the formula), but there is no known polynomial-time algorithm for finding one. If we had such an algorithm, we could solve every hard (NP) problem by efficient reduction to SAT. But as far as we know, the worst-case scenario for deciding SAT involves trying all 2^n possible assignments for a formula with n variables.
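To make that brute-force baseline concrete, here is a minimal sketch in Python (the post’s own code, later on, is in Racket); `brute_force_sat` and the example formula are illustrative names, not part of the post:

```python
from itertools import product

def brute_force_sat(variables, formula):
    """Try all 2^n assignments; return a satisfying one, or None."""
    for values in product([False, True], repeat=len(variables)):
        interpretation = dict(zip(variables, values))
        if formula(interpretation):
            return interpretation
    return None

# (x1 or not x2) and (not x1 or x2) is satisfiable, e.g. by x1 = x2 = False.
model = brute_force_sat(
    ["x1", "x2"],
    lambda I: (I["x1"] or not I["x2"]) and (not I["x1"] or I["x2"]))
```

Checking a candidate assignment takes one evaluation of the formula; it’s the search over all 2^n candidates that blows up.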
The good news is that the worst case happens surprisingly rarely in practice. Modern SAT solvers routinely defy the intractability of SAT to decide formulas with millions of variables and constraints. This has made SAT a go-to approach for solving hard problems in real-world applications such as configuring cars, verifying microchips and code, and flying space shuttles.
This post explains the basics of SAT and its applications. We’ll review the syntax and semantics of propositional (boolean) logic, define satisfiability more precisely, talk about methods for checking satisfiability, and see how to transform boolean formulas into CNF, the input language of SAT solvers. We’ll also build a tiny toy SAT solver based on DPLL, the algorithm at the core of all modern solvers.^{1}
 Propositional Logic
 Satisfiability and Validity
 Checking Satisfiability and Validity
 Normal Forms
 SAT Solving with DPLL
 Next Steps
The complete source code for this post can be found here.
Propositional Logic
Syntax
A formula in propositional logic is made up of constants, variables, and logical connectives. There are two constant symbols, ⊤ and ⊥, and infinitely many variable identifiers, x1, x2, x3, …. The logical connectives include the standard operators of boolean algebra: ¬, ∧, ∨, →, and ↔. Books and research papers often write atom to describe a variable or a constant, and literal to describe an atom or its negation.
- An atom is
  - a constant ⊤ (true) or ⊥ (false)
  - a variable xi
- A literal is an atom a or its negation ¬a.
- A formula is an atom or the application of a logical connective to formulas f1, f2:
  - ¬f1 (not)
  - f1 ∧ f2 (and)
  - f1 ∨ f2 (or)
  - f1 → f2 (implies)
  - f1 ↔ f2 (iff).
Semantics
The meaning of a formula is a truth value, true or false. To determine a formula’s meaning, we evaluate it in an environment, called an interpretation, that assigns truth values to the formula’s variables. For example, { x1 ↦ false, x2 ↦ true } is one possible interpretation for the formula ¬x1 ∧ x2.
We write I ⊨ f if the formula f evaluates to true under the interpretation I, and I ⊭ f otherwise. The symbols ⊨ and ⊭ are read as “satisfies” and “falsifies”, respectively. An interpretation that satisfies a formula is also called its model. The following rules show how to evaluate formulas:
- I ⊨ ⊤ (the constant ⊤ evaluates to true under every interpretation)
- I ⊭ ⊥ (the constant ⊥ evaluates to false under every interpretation)
- I ⊨ xi if and only if I(xi) = true
- I ⊨ ¬f if and only if I ⊭ f
- I ⊨ f1 ∧ f2 if and only if I ⊨ f1 and I ⊨ f2
- I ⊨ f1 ∨ f2 if and only if I ⊨ f1 or I ⊨ f2
- I ⊨ f1 → f2 if and only if I ⊭ f1 or I ⊨ f2
- I ⊨ f1 ↔ f2 if and only if either I ⊨ f1 and I ⊨ f2, or I ⊭ f1 and I ⊭ f2.
These rules operationalize the familiar truth table method for evaluating boolean expressions. They correspond closely to the code you’d write to implement an interpreter for propositional logic. For example, here is such an interpreter in Racket:
```racket
; Returns true (#t) if I ⊨ f and false (#f) if I ⊭ f.
(define (evaluate f I)
  (match f
    ['⊤ true]
    ['⊥ false]
    [(? symbol? xi) (lookup I xi)]
    [`(¬ ,f1) (not (evaluate f1 I))]
    [`(∧ ,fs ...) (for/and ([fi fs]) (evaluate fi I))]
    [`(∨ ,fs ...) (for/or ([fi fs]) (evaluate fi I))]
    [`(→ ,f1 ,f2) (or (not (evaluate f1 I)) (evaluate f2 I))]
    [`(↔ ,f1 ,f2) (equal? (evaluate f1 I) (evaluate f2 I))]))
```

```racket
> ; { x1 ↦ false, x2 ↦ true } ⊨ ¬x1 ∧ x2
(evaluate '(∧ (¬ x1) x2) (assign 'x1 false 'x2 true))
#t
> ; { x1 ↦ true, x2 ↦ true } ⊭ ¬x1 ∧ x2
(evaluate '(∧ (¬ x1) x2) (assign 'x1 true 'x2 true))
#f
```
Satisfiability and Validity
Satisfiability
Given a way to evaluate formulas, we can now state what it means for a formula to be satisfiable:
A formula f is satisfiable if and only if there is an interpretation under which f evaluates to true; that is, I ⊨ f for some I. Otherwise, f is unsatisfiable.
Validity
Another important property of formulas is validity:
A formula f is valid if and only if it evaluates to true under all interpretations; that is, I ⊨ f for all I. We write ⊨ f to say that f is valid.
Duality
Satisfiability and validity are duals of each other:
A formula f is valid if and only if its negation ¬f is unsatisfiable.
This simple observation follows directly from the definitions of satisfiability and validity and the semantics of negation. It also has a key practical consequence:
If we have a procedure for checking satisfiability, we can use it to check validity, and vice versa.
As a result, a SAT solver is both a satisfiability checker and a theorem prover.
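For intuition, the duality can be sketched in Python over a brute-force satisfiability check (hypothetical helpers for illustration; the post’s Racket code does not include these):

```python
from itertools import product

def satisfiable(variables, formula):
    """True if some assignment of the variables makes formula true."""
    return any(formula(dict(zip(variables, values)))
               for values in product([False, True], repeat=len(variables)))

def valid(variables, formula):
    """A formula is valid iff its negation is unsatisfiable."""
    return not satisfiable(variables, lambda I: not formula(I))

# x1 ∨ ¬x1 is valid; x1 alone is satisfiable but not valid.
tautology_is_valid = valid(["x1"], lambda I: I["x1"] or not I["x1"])
x1_is_valid = valid(["x1"], lambda I: I["x1"])
```

The same two-line wrapper works in the other direction too: f is satisfiable iff ¬f is not valid.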
This is why we can use SAT (and SMT) solvers to build tools that reason about code. For example, to prove that a program correctly implements its specification, a verifier needs to check the validity of a formula P → S, where P encodes all program behaviors and S encodes the specification. Automated verifiers do this by using solvers to decide if ¬(P → S), the negation of the verification formula, is satisfiable. If so, a satisfying assignment represents a concrete counterexample: a behavior of the program that violates the specification. If not, the program is correct.
Checking Satisfiability and Validity
So, how do we show that a formula is satisfiable or, dually, valid? There are two main approaches: search and deduction. As we’ll see later on, a SAT solver uses both.
Search
The search approach is straightforward. To decide if a formula is valid, we check that it evaluates to true under every possible interpretation.
Using our Racket interpreter for propositional logic, we can implement the search approach as follows:
```racket
; Returns true (#t) if f is valid and false (#f) otherwise,
; by checking that every interpretation satisfies f.
(define (search-valid? f)
  (let all-sat? ([vars (variables f)] [I (assign)])
    (match vars
      [(list)
       (evaluate f I)]
      [(list v vs ...)
       (and (all-sat? vs (extend I v false))
            (all-sat? vs (extend I v true)))])))
```

```racket
> ; ⊨ (x1 ∧ x2) → (x1 ∨ ¬x2)
(search-valid? '(→ (∧ x1 x2) (∨ x1 (¬ x2))))
#t
> ; ⊭ (x1 ∨ ¬x2) → (x1 ∧ x2)
(search-valid? '(→ (∨ x1 (¬ x2)) (∧ x1 x2)))
#f
```
Deduction
The deduction approach is more involved.
We start by assuming that the formula f is invalid; so, I ⊭ f for some I.
Then, we apply proof rules for propositional logic to derive all possible consequences of our assumption. A proof rule consists of a premise and a conclusion. The premise states the facts that have to be established (either from the assumption or previous derivations) before we can apply the rule. The conclusion states the facts that we can derive (add to the set of known facts) as the result of applying the rule.
① If I ⊨ ¬f1, deduce I ⊭ f1.  ② If I ⊭ ¬f1, deduce I ⊨ f1.
③ If I ⊨ f1 ∧ f2, deduce I ⊨ f1 and I ⊨ f2.  ④ If I ⊭ f1 ∧ f2, branch on I ⊭ f1 and I ⊭ f2.
⑤ If I ⊨ f1 ∨ f2, branch on I ⊨ f1 and I ⊨ f2.  ⑥ If I ⊭ f1 ∨ f2, deduce I ⊭ f1 and I ⊭ f2.
⑦ If I ⊨ f1 → f2, branch on I ⊭ f1 and I ⊨ f2.  ⑧ If I ⊭ f1 → f2, deduce I ⊨ f1 and I ⊭ f2.
⑨ If I ⊨ f1 ↔ f2, branch on (I ⊨ f1, I ⊨ f2) and (I ⊭ f1, I ⊭ f2).  ⑩ If I ⊭ f1 ↔ f2, branch on (I ⊨ f1, I ⊭ f2) and (I ⊭ f1, I ⊨ f2).
Applying proof rules produces a proof tree, with branches resulting from the introduction of alternative facts. If every branch of the proof tree contains a contradiction (e.g., both the fact I ⊨ f1 and the fact I ⊭ f1), then we know that our initial assumption was untrue and f must be valid. Otherwise, f is invalid, and we can read off the falsifying interpretation from the proof tree.
For example, we can prove that (x1 ∧ (x1 → x2)) → x2 is valid as follows:

1. I ⊭ (x1 ∧ (x1 → x2)) → x2, by assumption
2. I ⊭ x2, by 1 and ⑧
3. I ⊨ x1 ∧ (x1 → x2), by 1 and ⑧
4. I ⊨ x1, by 3 and ③
5. I ⊨ x1 → x2, by 3 and ③
6. I ⊭ x1, by 5 and ⑦ (branch), contradicts 4
7. I ⊨ x2, by 5 and ⑦ (branch), contradicts 2
Here is an excerpt from an implementation of the deduction approach:

```racket
; Returns true (#t) if f is valid and false (#f) otherwise,
; by checking that all branches of f's proof tree are
; closed (i.e., contain a contradiction).
(define (deduce-valid? f)
  (let all-closed? ([facts `((I ⊭ ,f))]) ; assumption
    (match facts
      [`(,gs ... (I ⊨ (¬ ,f1)) ,hs ...) ; (1)
       (all-closed? `((I ⊭ ,f1) ,@gs ,@hs))]
      ; proof rules (2)–(10)
      [(or `(,_ ... (I ⊨ ,fi) ,_ ... (I ⊭ ,fi) ,_ ...)
           `(,_ ... (I ⊭ ,fi) ,_ ... (I ⊨ ,fi) ,_ ...))
       (printf "Contradiction on ~a: ~a\n" fi facts)
       #t]
      [_
       (printf "Open branch: ~a\n" facts)
       #f])))
```

```racket
> ; ⊨ (x1 ∧ (x1 → x2)) → x2
(deduce-valid? '(→ (∧ x1 (→ x1 x2)) x2))
Contradiction on x1: ((I ⊭ x1) (I ⊨ x1) (I ⊭ x2))
Contradiction on x2: ((I ⊨ x2) (I ⊨ x1) (I ⊭ x2))
#t
> ; ⊭ (x1 ∨ ¬x2) → (x1 ∧ x2)
(deduce-valid? '(→ (∨ x1 (¬ x2)) (∧ x1 x2)))
Contradiction on x1: ((I ⊨ x1) (I ⊭ x1))
Open branch: ((I ⊭ x2) (I ⊭ x1))
#f
```
Normal Forms
As we just saw, generic search and deduction can decide the satisfiability (or validity) of arbitrary propositional formulas. But practical decision procedures, including SAT solvers, are designed to work on formulas with restricted syntactic structure.
These restrictions are called normal forms because every formula in a given logic has an equivalent formula in the normal form. Two formulas f1 and f2 are equivalent if and only if they have the same meaning under every interpretation, i.e., f1 ↔ f2 is valid.
NNF, DNF, and CNF
There are three widely used normal forms for propositional logic.
- Negation Normal Form (NNF)
  A formula in NNF uses only ¬, ∧, and ∨ as connectives, with negations appearing only in literals. For example, (¬x1) ∧ x2 is in NNF but ¬(x1 ∨ x2) is not.
- Disjunctive Normal Form (DNF)
  A formula in DNF is a disjunction of conjunctions of literals: (l11 ∧ … ∧ l1k) ∨ … ∨ (ln1 ∧ … ∧ lnm), where each lij is a literal. For example, (x1 ∧ ¬x2) ∨ (¬x1 ∧ x2) is in DNF.
- Conjunctive Normal Form (CNF)
  A formula in CNF is a conjunction of disjunctions of literals: (l11 ∨ … ∨ l1k) ∧ … ∧ (ln1 ∨ … ∨ lnm), where each lij is a literal. For example, (x1 ∨ ¬x2) ∧ (¬x1 ∨ x2) is in CNF. The disjunctive subformulas of a CNF formula, li1 ∨ … ∨ lik, are called clauses.
We can convert any propositional formula to these forms using simple rewrite rules.

```racket
; Converts f to NNF by first limiting operators
; to ∧, ∨, and ¬, and then pushing ¬ down to the
; leaves (literals) of the formula.
(define (nnf f)
  (nnf-neg (nnf-ops f)))

; Converts f to DNF by first converting f to NNF,
; then distributing ∧ over ∨, and finally flattening
; out nested conjunctions and disjunctions.
(define (dnf f)
  (unnest (distribute '∧ '∨ (nnf f))))

; Converts f to CNF as done for the DNF conversion but
; swapping ∧ for ∨ and vice versa.
(define (cnf f)
  (unnest (distribute '∨ '∧ (nnf f))))
```

```racket
> ; NNF: ¬(x1 → (¬(x1 ∧ x2)))
(nnf '(¬ (→ x1 (¬ (∧ x1 x2)))))
'(∧ x1 (∧ x1 x2))
> ; DNF: (x1 ∨ x2) ∧ (¬(x3 ∧ x4))
(dnf '(∧ (∨ x1 x2) (¬ (∧ x3 x4))))
'(∨ (∧ x1 (¬ x3)) (∧ x2 (¬ x3)) (∧ x1 (¬ x4)) (∧ x2 (¬ x4)))
> ; CNF: (x1 ∧ x2) ∨ (¬(x3 ∨ x4))
(cnf '(∨ (∧ x1 x2) (¬ (∨ x3 x4))))
'(∧ (∨ x1 (¬ x3)) (∨ x2 (¬ x3)) (∨ x1 (¬ x4)) (∨ x2 (¬ x4)))
```
Properties of normal forms
What makes NNF, DNF, and CNF practically interesting? CNF is the input language of SAT solvers. NNF is often used by SAT-based tools as an intermediate representation for optimizing formulas prior to CNF conversion. And DNF is easy to decide: we can decide the satisfiability of DNF formulas in linear time!^{2}
So why do SAT solvers use CNF instead of DNF? After all, can’t we trivially decide the satisfiability of any propositional formula by first converting it to DNF?
The problem is that converting an arbitrary formula to an equivalent formula in DNF can lead to an exponential increase in formula size, as seen in the above code example. By using DNF, we’d just pay the exponential penalty in the conversion process instead of the decision process.
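To see the blowup concretely, consider the family (x1 ∨ y1) ∧ … ∧ (xn ∨ yn): distributing ∧ over ∨ picks one literal from each clause, so the equivalent DNF has 2^n conjunctions. A Python sketch of the distribution step (an illustration, not the post’s Racket `dnf`):

```python
from itertools import product

def distribute_and_over_or(clauses):
    """Convert a CNF clause list to DNF by choosing one literal
    from each clause; each choice becomes one conjunction."""
    return [list(choice) for choice in product(*clauses)]

# Three binary clauses already yield 2^3 = 8 conjunctions.
dnf_terms = distribute_and_over_or([["x1", "y1"], ["x2", "y2"], ["x3", "y3"]])
```

Each added binary clause doubles the number of conjunctions, so the output size is exponential in the input size.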
At this point, you might notice that the same kind of exponential explosion happens when converting a formula to an equivalent CNF representation. Given this, how is it possible to make effective use of SAT solvers, which take CNF formulas as input?
Equisatisfiability
The key is to observe that we don’t need to convert formulas to an equivalent CNF representation to check their satisfiability. Instead, all we need is to convert the input formula f to a CNF formula g such that f and g are either both satisfiable or both unsatisfiable. This relation between f and g is called equisatisfiability, and it’s a weaker notion than equivalence (for example, x1 and x2 are equisatisfiable though not equivalent). But crucially, we have an efficient procedure for converting any propositional formula to an equisatisfiable CNF formula.
Tseitin’s encoding
To convert a formula to an equisatisfiable CNF representation, we use a procedure called Tseitin’s encoding. This procedure has two important properties. First, the size of the resulting CNF formula is linear in the size of the original formula f. Second, a satisfying interpretation for the encoding, if any, can be straightforwardly converted to a satisfying interpretation for f. So, applying a SAT solver to Tseitin’s encoding of a formula lets us obtain not only a yes/no decision but also a satisfying interpretation if the solver produces one.
The key idea behind Tseitin’s encoding is to use new propositional variables, called auxiliary variables, to refer to the subformulas of the input formula f.

We begin by introducing a fresh propositional variable ai to represent each subformula fi of f. This involves creating a conjunction of formulas of the form ai ↔ fi, and then replacing each occurrence of fi with ai in this conjunction.

For example, to transform x1 → (x2 ∧ x3), we generate the following conjunction, where a1 represents x1 → (x2 ∧ x3) and a2 represents x2 ∧ x3:

a1 ∧ (a1 ↔ (x1 → a2)) ∧ (a2 ↔ (x2 ∧ x3))
Next, we convert the resulting conjunction to an equivalent CNF formula using the cnf procedure we saw previously. Continuing the above example, we’d produce the following clauses, which are satisfiable if and only if the original formula is satisfiable:

- CNF representation of a1: a1
- CNF representation of a1 ↔ (x1 → a2): (¬a1 ∨ ¬x1 ∨ a2) ∧ (x1 ∨ a1) ∧ (¬a2 ∨ a1)
- CNF representation of a2 ↔ (x2 ∧ x3): (x2 ∨ ¬a2) ∧ (x3 ∨ ¬a2) ∧ (¬x2 ∨ ¬x3 ∨ a2)
This conversion step won’t lead to exponential explosion because the introduction of auxiliary variables eliminated all nested subexpressions from the ai ↔ fi formulas. In particular, the initial transformation step ensures that the right-hand side of each ai ↔ fi formula is an application of a logical connective to original or auxiliary variables.
Putting these two steps together, we get a basic implementation of Tseitin’s encoding:
```racket
; Converts f to an equisatisfiable formula in CNF
; using Tseitin's encoding.
(define (tseitin f)
  (cnf (auxiliaries f)))
```

```racket
> ; Tseitin: x1 → (x2 ∧ x3)
(tseitin `(→ x1 (∧ x2 x3)))
'(∧
  a6743
  (∨ (¬ a6743) (¬ x1) a6744)
  (∨ x1 a6743)
  (∨ (¬ a6744) a6743)
  (∨ x2 (¬ a6744))
  (∨ x3 (¬ a6744))
  (∨ (¬ x2) (¬ x3) a6744))
```
SAT Solving with DPLL
The DPLL algorithm
A SAT solver is a procedure for deciding if a given CNF formula is satisfiable. Most SAT solvers are based on the DPLL algorithm, invented in 1962 by Davis, Putnam, Logemann, and Loveland. DPLL is a sound and complete procedure for deciding the satisfiability of CNF formulas: it is guaranteed to terminate, and it outputs “yes” if and only if the input formula has a satisfying interpretation.
The key idea behind DPLL is to combine backtracking search with deduction, as shown in the following pseudocode:
```
1  DPLL(f):
2    g = BCP(f)
3    if g reduces to ⊤ then return true
4    if g reduces to ⊥ then return false
5    x = choose(vars(g))
6    return DPLL(g ∧ x) ∨ DPLL(g ∧ ¬x)
```
Given a CNF formula f, DPLL first performs deduction in the form of boolean constraint propagation (BCP). The BCP procedure applies the unit resolution rule until it reaches a fixed point:

The unit rule uses single-literal, or unit, clauses contained in f to make two simple deductions. First, if f includes both a unit clause l and a clause c that contains the literal l, then we can replace c with ⊤ in f. Second, if a clause c contains the negation ¬l, then we can replace ¬l with ⊥ in c.
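In the integer-literal clause representation the post uses later (positive and negative integers for positive and negative literals), the unit rule can be sketched in Python as follows; this simplified `bcp` drops satisfied clauses rather than rewriting them to ⊤, and is an illustration, not the post’s Racket implementation:

```python
def bcp(clauses):
    """Apply the unit rule until fixed point. A clause containing a
    unit literal l is satisfied and dropped; occurrences of -l are
    deleted from the remaining clauses. An empty clause means unsat."""
    clauses = [list(c) for c in clauses]
    while True:
        unit = next((c[0] for c in clauses if len(c) == 1), None)
        if unit is None:
            return clauses
        clauses = [[lit for lit in c if lit != -unit]
                   for c in clauses if unit not in c]

# x1 ∧ (¬x1 ∨ x2) ∧ (¬x2 ∨ x3) propagates away completely (satisfiable),
# while x1 ∧ ¬x1 propagates to the empty clause (unsatisfiable).
```

Each round removes at least the unit clause itself, so the loop terminates; an empty result means every clause was satisfied, and a result containing an empty clause signals a conflict.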
It’s easy to see that BCP leads to valid conclusions (it’s sound) but that it can’t decide the satisfiability of arbitrary CNF formulas (it’s incomplete).
To ensure completeness, DPLL employs backtracking search (lines 5–6). If BCP doesn’t manage to reduce g to a constant, then we pick one of the remaining variables x, and call DPLL recursively assuming that the variable is either true (g ∧ x) or false (g ∧ ¬x). If the formula is satisfiable under either assumption, then we know that it has a satisfying assignment (expressed in the assumptions). Otherwise, the formula is unsatisfiable.
A toy SAT solver
While competitive SAT solvers make many improvements to DPLL to make it scale, even a basic implementation of the algorithm is quick to solve problems with a hundred or so variables and clauses:
```racket
; If f is satisfiable, returns a model of f given as a list of
; integers (positive or negative literals). Otherwise returns false.
(define (dpll f)
  (define g (bcp f))
  (match g
    [`((,lit) ...) lit]                          ; model
    [`(,_ ... () ,_ ...) #f]                     ; unsat
    [`((,assigned) ... (,lit ,_ ,_ ...) ,_ ...)  ; search
     (let* ([undecided (drop g (length assigned))]
            [result (or (dpll `((,lit) ,@undecided))
                        (dpll `((,(- lit)) ,@undecided)))])
       (and result `(,@assigned ,@result)))]))
```
The above implementation represents a CNF formula as a list of clauses. A clause is represented as a list of integers, where positive and negative integers stand for positive and negative literals, respectively. For example, we represent the formula (x1 ∨ ¬x3) ∧ (¬x1 ∨ x2 ∨ x3) as the list '((1 -3) (-1 2 3)). The empty list '() represents the empty clause, which evaluates to ⊥ under all interpretations.
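Under this kind of integer encoding, a clause is satisfied when it shares a literal with the model; for example, (x1 ∨ ¬x3) ∧ (¬x1 ∨ x2 ∨ x3) becomes [[1, -3], [-1, 2, 3]]. A quick Python sketch of evaluation in this representation (illustrative; the post’s helpers are in Racket):

```python
def evaluate_cnf(clauses, model):
    """model is a set of integer literals; e.g. {1, 2, -3} sets
    x1 = true, x2 = true, x3 = false. A CNF formula is true iff
    every clause contains at least one literal in the model."""
    return all(any(lit in model for lit in clause) for clause in clauses)

# (x1 ∨ ¬x3) ∧ (¬x1 ∨ x2 ∨ x3) under x1 = x2 = true, x3 = false:
result = evaluate_cnf([[1, -3], [-1, 2, 3]], {1, 2, -3})
```

This is also the format used by the DIMACS CNF files that SAT solvers read, minus the header and the terminating zeros.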
Applying our toy solver to this problem, which contains 50 variables and 80 clauses, produces a model in a millisecond:

```racket
> (define f (read-cnf "example.cnf"))
> (time (dpll f))
cpu time: 1 real time: 1 gc time: 0
'(16 17 21 28 13 ...)
```
A naive backtracking search, on the other hand, doesn’t come back after two hours when applied to the same problem:
```racket
; If f is satisfiable, returns a model of f given as a list of
; integers (positive or negative literals). Otherwise returns false.
(define (search-sat f)
  (let search ([vars (variables f)] [I (list)])
    (if (evaluate f I)
        I
        (match vars
          [(list) #f]
          [(list v vs ...)
           (or (search vs (cons v I))
               (search vs (cons (- v) I)))]))))
```

```racket
> (define f (read-cnf "example.cnf"))
> (time
   (with-handlers ([exn? (lambda (e) (displayln (exn-message e)))])
     (search-sat f)))
user break
cpu time: 10007420 real time: 10003982 gc time: 6017
```
Next Steps
To learn more about how competitive SAT solvers work, see these excellent review papers:
- [1] Joao Marques-Silva, Ines Lynce, and Sharad Malik. Chapter 4: Conflict-Driven Clause Learning SAT Solvers. Handbook of Satisfiability. 2008.
 [2] Carla P. Gomes, Henry Kautz, Ashish Sabharwal, and Bart Selman. Chapter 2: Satisfiability Solvers. Handbook of Knowledge Representation. 2008.
Or take a look at the source code for MiniSAT, on which most other solvers are based.
To choose the best solver for your problem, check out the results from the latest SAT competition.
Happy SAT solving!

This post is based on Lecture 1 of my graduate course on Computer-Aided Reasoning for Software. The lecture covers the material in Chapter 1 of The Calculus of Computation and Decision Procedures. ↩

To decide the satisfiability of a DNF formula, we scan each of its disjuncts (conjunctions of literals) to find one that doesn’t include both a variable and its negation, or the constant ⊥. An interpretation that satisfies this subformula (by assigning true and false to the non-negated and negated variables, respectively) satisfies the entire formula. ↩
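In the same integer-literal encoding used for the toy solver, this linear-time check can be sketched in Python (an illustration; constants are assumed to have been simplified away):

```python
def dnf_satisfiable(terms):
    """terms is a list of conjunctions, each a list of integer
    literals. A conjunction is satisfiable iff it contains no
    complementary pair l, -l; the whole disjunction is satisfiable
    iff some conjunction is."""
    return any(all(-lit not in set(term) for lit in term) for term in terms)

# (x1 ∧ ¬x1) ∨ (x2 ∧ x3) is satisfiable via its second conjunction.
```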