Clausal Normal Form (CNF): From Formulas to Clauses
TL;DR: Clausal Normal Form (CNF) is a standardized formula format essential for SAT solvers and automated theorem proving. This guide covers the definition, conversion algorithms, practical examples, and applications.
Introduction
In propositional logic, different formula structures can represent the same logical meaning. Clausal Normal Form (CNF), also called Conjunctive Normal Form, standardizes formulas into a specific structure: a conjunction (AND) of disjunctions (OR).
This normalization is critical for:
- SAT solvers that check formula satisfiability
- Automated theorem proving via resolution
- Hardware verification and model checking
- Constraint satisfaction problems
- AI planning and scheduling
- Formal verification of software systems
Consider this analogy: just as JSON standardizes data exchange between systems, CNF standardizes logical formulas for automated reasoning tools. Without this standardization, every SAT solver would need to handle arbitrary formula structures, making implementation complex and inefficient.
Definitions
Literal
A literal is either a propositional variable or its negation:
- Positive literal:
P,Q,R - Negative literal:
¬P,¬Q,¬R
Example in Python:
class Literal:
def __init__(self, variable, negated=False):
self.variable = variable
self.negated = negated
def __repr__(self):
return f"{'¬' if self.negated else ''}{self.variable}"
# Create literals
p = Literal('P') # P
not_p = Literal('P', True) # ¬P
Clause
A clause is a disjunction (OR) of literals:
P ∨ Q ∨ ¬R
Special cases:
- Unit clause: Contains exactly one literal (e.g.,
P) - Binary clause: Contains exactly two literals (e.g.,
P ∨ Q) - Empty clause: Contains zero literals (⊥, always false)
- Tautology: Contains both
Pand¬P(always true, often removed)
Example in Python:
class Clause:
def __init__(self, literals):
self.literals = literals
def is_unit(self):
return len(self.literals) == 1
def is_empty(self):
return len(self.literals) == 0
def __repr__(self):
return ' ∨ '.join(str(lit) for lit in self.literals)
# Create clauses
clause1 = Clause([Literal('P'), Literal('Q')]) # P ∨ Q
unit = Clause([Literal('R', True)]) # ¬R (unit clause)
Clausal Normal Form (CNF)
A formula is in CNF if it is a conjunction (AND) of clauses:
(P ∨ Q) ∧ (¬P ∨ R) ∧ (¬Q ∨ ¬R)
Example in Python:
class CNF:
def __init__(self, clauses):
self.clauses = clauses
def __repr__(self):
return ' ∧ '.join(f"({c})" for c in self.clauses)
# Create CNF formula
cnf = CNF([
Clause([Literal('P'), Literal('Q')]),
Clause([Literal('P', True), Literal('R')]),
Clause([Literal('Q', True), Literal('R', True)])
])
print(cnf) # (P ∨ Q) ∧ (¬P ∨ R) ∧ (¬Q ∨ ¬R)
Why CNF Matters
| Application | Why CNF is Used |
|---|---|
| SAT Solvers | Input format for DPLL and CDCL algorithms |
| Theorem Proving | Resolution works directly on clauses |
| Model Checking | Efficient state space traversal |
| Circuit Verification | Boolean circuit equivalence checking |
Converting to CNF
Step-by-Step Algorithm
The standard CNF conversion algorithm follows these steps:
Step 1: Eliminate implications and biconditionals
P → Q ≡ ¬P ∨ Q
P ↔ Q ≡ (P → Q) ∧ (Q → P) ≡ (¬P ∨ Q) ∧ (¬Q ∨ P)
Example in Python:
def eliminate_implications(formula):
"""Convert implications to disjunctions."""
if formula['op'] == '→':
left = eliminate_implications(formula['left'])
right = eliminate_implications(formula['right'])
return {'op': '∨', 'left': {'op': '¬', 'arg': left}, 'right': right}
elif formula['op'] == '↔':
left = eliminate_implications(formula['left'])
right = eliminate_implications(formula['right'])
# (P ↔ Q) ≡ (¬P ∨ Q) ∧ (¬Q ∨ P)
return {
'op': '∧',
'left': {'op': '∨', 'left': {'op': '¬', 'arg': left}, 'right': right},
'right': {'op': '∨', 'left': {'op': '¬', 'arg': right}, 'right': left}
}
return formula
Step 2: Move negation inward (using De Morgan’s laws)
¬(P ∧ Q) ≡ ¬P ∨ ¬Q
¬(P ∨ Q) ≡ ¬P ∧ ¬Q
¬¬P ≡ P
Example in Python:
def push_negation_inward(formula):
"""Apply De Morgan's laws to push negations to literals."""
if formula['op'] == '¬':
arg = formula['arg']
if arg['op'] == '¬':
# ¬¬P ≡ P
return push_negation_inward(arg['arg'])
elif arg['op'] == '∧':
# ¬(P ∧ Q) ≡ ¬P ∨ ¬Q
return {
'op': '∨',
'left': push_negation_inward({'op': '¬', 'arg': arg['left']}),
'right': push_negation_inward({'op': '¬', 'arg': arg['right']})
}
elif arg['op'] == '∨':
# ¬(P ∨ Q) ≡ ¬P ∧ ¬Q
return {
'op': '∧',
'left': push_negation_inward({'op': '¬', 'arg': arg['left']}),
'right': push_negation_inward({'op': '¬', 'arg': arg['right']})
}
return formula
Step 3: Distribute OR over AND
P ∨ (Q ∧ R) ≡ (P ∨ Q) ∧ (P ∨ R)
(P ∧ Q) ∨ R ≡ (P ∨ R) ∧ (Q ∨ R)
Example in Python:
def distribute_or_over_and(formula):
"""Distribute OR over AND to achieve CNF."""
if formula['op'] == '∨':
left = distribute_or_over_and(formula['left'])
right = distribute_or_over_and(formula['right'])
# P ∨ (Q ∧ R) ≡ (P ∨ Q) ∧ (P ∨ R)
if right['op'] == '∧':
return {
'op': '∧',
'left': distribute_or_over_and({'op': '∨', 'left': left, 'right': right['left']}),
'right': distribute_or_over_and({'op': '∨', 'left': left, 'right': right['right']})
}
# (P ∧ Q) ∨ R ≡ (P ∨ R) ∧ (Q ∨ R)
elif left['op'] == '∧':
return {
'op': '∧',
'left': distribute_or_over_and({'op': '∨', 'left': left['left'], 'right': right}),
'right': distribute_or_over_and({'op': '∨', 'left': left['right'], 'right': right})
}
return formula
Complete CNF Conversion:
def to_cnf(formula):
"""Convert formula to CNF."""
formula = eliminate_implications(formula)
formula = push_negation_inward(formula)
formula = distribute_or_over_and(formula)
return formula
Examples
Example 1: Simple Formula
Convert: P → Q
Step 1: Remove implication
¬P ∨ Q
This is already in CNF (one clause: ¬P ∨ Q).
Example 2: With AND
Convert: P ∧ (P → Q)
Step 1: Remove implication
P ∧ (¬P ∨ Q)
Step 2: Already has negation pushed inward.
Step 3: Already in CNF:
- Clause 1:
P - Clause 2:
¬P ∨ Q
Example 3: Complex Formula
Convert: ¬(P ∨ Q) → R
Step 1: Remove implication
¬¬(P ∨ Q) ∨ R
≡ (P ∨ Q) ∨ R
≡ P ∨ Q ∨ R
Result: Single clause (P ∨ Q ∨ R) ✓
Example 4: Biconditional
Convert: (P ↔ Q) ∧ ¬R
Step 1: Remove ↔ (biconditional)
(P → Q) ∧ (Q → P) ∧ ¬R
≡ (¬P ∨ Q) ∧ (¬Q ∨ P) ∧ ¬R
Result: Three clauses:
(¬P ∨ Q)(¬Q ∨ P)¬R
Example 5: With Distribution
Convert: (P ∧ Q) ∨ R
Step 3: Distribute OR over AND
(P ∨ R) ∧ (Q ∨ R)
Result: Two clauses in CNF.
Example 6: Nested Formula
Convert: ¬((P → Q) ∧ (Q → R))
Step 1: Remove implications
¬((¬P ∨ Q) ∧ (¬Q ∨ R))
Step 2: Apply De Morgan’s law
¬(¬P ∨ Q) ∨ ¬(¬Q ∨ R)
≡ (¬¬P ∧ ¬Q) ∨ (¬¬Q ∧ ¬R)
≡ (P ∧ ¬Q) ∨ (Q ∧ ¬R)
Step 3: Distribute OR over AND
(P ∨ Q) ∧ (P ∨ ¬R) ∧ (¬Q ∨ Q) ∧ (¬Q ∨ ¬R)
Simplify (remove tautology ¬Q ∨ Q):
(P ∨ Q) ∧ (P ∨ ¬R) ∧ (¬Q ∨ ¬R)
Example 7: Three-Variable Formula
Convert: (P ∨ Q) → (R ∧ S)
Step 1: Remove implication
¬(P ∨ Q) ∨ (R ∧ S)
Step 2: Push negation inward
(¬P ∧ ¬Q) ∨ (R ∧ S)
Step 3: Distribute OR over AND
((¬P ∧ ¬Q) ∨ R) ∧ ((¬P ∧ ¬Q) ∨ S)
≡ (¬P ∨ R) ∧ (¬Q ∨ R) ∧ (¬P ∨ S) ∧ (¬Q ∨ S)
Result: Four clauses in CNF.
Python implementation:
# Example: Convert (P ∨ Q) → (R ∧ S)
formula = {
'op': '→',
'left': {'op': '∨', 'left': {'var': 'P'}, 'right': {'var': 'Q'}},
'right': {'op': '∧', 'left': {'var': 'R'}, 'right': {'var': 'S'}}
}
cnf_formula = to_cnf(formula)
# Result: (¬P ∨ R) ∧ (¬Q ∨ R) ∧ (¬P ∨ S) ∧ (¬Q ∨ S)
Tseitin Transformation
The Tseitin transformation is a more efficient way to convert formulas to CNF by introducing fresh variables for subformulas. This avoids the exponential blowup that can occur with naive distribution.
Why Tseitin?
Consider converting (A₁ ∧ B₁) ∨ (A₂ ∧ B₂) ∨ ... ∨ (Aₙ ∧ Bₙ):
- Naive approach: Results in 2ⁿ clauses (exponential)
- Tseitin approach: Results in O(n) clauses (linear)
Tseitin Encoding Rules
For each subformula, introduce a new variable and add equivalence clauses:
For conjunction (X ↔ (A ∧ B)):
(X → (A ∧ B)) ∧ ((A ∧ B) → X)
≡ (¬X ∨ A) ∧ (¬X ∨ B) ∧ (¬A ∨ ¬B ∨ X)
For disjunction (X ↔ (A ∨ B)):
(X → (A ∨ B)) ∧ ((A ∨ B) → X)
≡ (¬X ∨ A ∨ B) ∧ (¬A ∨ X) ∧ (¬B ∨ X)
For negation (X ↔ ¬A):
(¬X ∨ ¬A) ∧ (A ∨ X)
Example: Tseitin Transformation
Convert: (P ∧ Q) ∨ (R ∧ S)
Step 1: Introduce variables
- Let
X₁ ≡ P ∧ Q - Let
X₂ ≡ R ∧ S - Let
X₃ ≡ X₁ ∨ X₂(the root)
Step 2: Add equivalence clauses
For X₁ ↔ (P ∧ Q):
(¬X₁ ∨ P) ∧ (¬X₁ ∨ Q) ∧ (¬P ∨ ¬Q ∨ X₁)
For X₂ ↔ (R ∧ S):
(¬X₂ ∨ R) ∧ (¬X₂ ∨ S) ∧ (¬R ∨ ¬S ∨ X₂)
For X₃ ↔ (X₁ ∨ X₂):
(¬X₃ ∨ X₁ ∨ X₂) ∧ (¬X₁ ∨ X₃) ∧ (¬X₂ ∨ X₃)
Step 3: Assert root variable
X₃
Final CNF (9 clauses total):
X₃ ∧
(¬X₁ ∨ P) ∧ (¬X₁ ∨ Q) ∧ (¬P ∨ ¬Q ∨ X₁) ∧
(¬X₂ ∨ R) ∧ (¬X₂ ∨ S) ∧ (¬R ∨ ¬S ∨ X₂) ∧
(¬X₃ ∨ X₁ ∨ X₂) ∧ (¬X₁ ∨ X₃) ∧ (¬X₂ ∨ X₃)
Python implementation:
class TseitinTransformer:
def __init__(self):
self.var_counter = 0
self.clauses = []
def new_var(self):
self.var_counter += 1
return f"X{self.var_counter}"
def transform(self, formula):
"""Transform formula to CNF using Tseitin."""
root_var = self.tseitin_encode(formula)
# Assert root variable
self.clauses.append([root_var])
return self.clauses
def tseitin_encode(self, formula):
"""Recursively encode formula."""
if 'var' in formula:
return formula['var']
op = formula['op']
if op == '∧':
left_var = self.tseitin_encode(formula['left'])
right_var = self.tseitin_encode(formula['right'])
new_var = self.new_var()
# X ↔ (A ∧ B)
self.clauses.append([f"¬{new_var}", left_var])
self.clauses.append([f"¬{new_var}", right_var])
self.clauses.append([f"¬{left_var}", f"¬{right_var}", new_var])
return new_var
elif op == '∨':
left_var = self.tseitin_encode(formula['left'])
right_var = self.tseitin_encode(formula['right'])
new_var = self.new_var()
# X ↔ (A ∨ B)
self.clauses.append([f"¬{new_var}", left_var, right_var])
self.clauses.append([f"¬{left_var}", new_var])
self.clauses.append([f"¬{right_var}", new_var])
return new_var
elif op == '¬':
arg_var = self.tseitin_encode(formula['arg'])
new_var = self.new_var()
# X ↔ ¬A
self.clauses.append([f"¬{new_var}", f"¬{arg_var}"])
self.clauses.append([arg_var, new_var])
return new_var
# Example usage
transformer = TseitinTransformer()
formula = {
'op': '∨',
'left': {'op': '∧', 'left': {'var': 'P'}, 'right': {'var': 'Q'}},
'right': {'op': '∧', 'left': {'var': 'R'}, 'right': {'var': 'S'}}
}
cnf_clauses = transformer.transform(formula)
Advantages
| Aspect | Naive CNF | Tseitin |
|---|---|---|
| Size | Exponential (worst case) | Linear (always) |
| Variables | Original only | Additional introduced |
| Preserves | Logical equivalence | Equisatisfiability |
| Use case | Small formulas | Large formulas |
Note: Tseitin preserves satisfiability but not equivalence. The new formula is satisfiable if and only if the original is, but they may have different models due to the auxiliary variables.
CNF in SAT Solvers
DIMACS Format
Modern SAT solvers like CDCL (Conflict-Driven Clause Learning) work on CNF. The standard input format is DIMACS:
c This is a comment
c Variables: 1=P, 2=Q, 3=R
p cnf 3 5
-1 -2 0
1 -2 0
-1 2 0
1 -3 0
-1 3 0
Format explanation:
p cnf 3 5: Problem line (3 variables, 5 clauses)- Each line is a clause: literals separated by spaces, terminated by
0 - Positive number = positive literal, negative = negated literal
-1 -2 0represents(¬P ∨ ¬Q)
This represents:
(¬P ∨ ¬Q) ∧ (P ∨ ¬Q) ∧ (¬P ∨ Q) ∧ (P ∨ ¬R) ∧ (¬P ∨ R)
Python DIMACS writer:
def write_dimacs(cnf_clauses, num_vars, filename):
"""Write CNF to DIMACS format."""
with open(filename, 'w') as f:
f.write(f"p cnf {num_vars} {len(cnf_clauses)}\n")
for clause in cnf_clauses:
clause_str = ' '.join(str(lit) for lit in clause) + ' 0\n'
f.write(clause_str)
# Example
clauses = [[-1, -2], [1, -2], [-1, 2], [1, -3], [-1, 3]]
write_dimacs(clauses, 3, 'formula.cnf')
Python DIMACS reader:
def read_dimacs(filename):
"""Read CNF from DIMACS format."""
clauses = []
with open(filename, 'r') as f:
for line in f:
if line.startswith('c') or line.startswith('p'):
continue
clause = [int(x) for x in line.strip().split() if x != '0']
if clause:
clauses.append(clause)
return clauses
# Example
cnf = read_dimacs('formula.cnf')
print(cnf) # [[-1, -2], [1, -2], [-1, 2], [1, -3], [-1, 3]]
Unit Propagation
In CNF, unit clauses force variable assignments:
(P) ∧ (¬P ∨ Q) ∧ (¬Q ∨ R)
Propagation steps:
- From
(P), we deriveP = true - Substitute into
(¬P ∨ Q): becomes(false ∨ Q)=(Q) - From
(Q), we deriveQ = true - Substitute into
(¬Q ∨ R): becomes(false ∨ R)=(R) - From
(R), we deriveR = true
Python implementation:
def unit_propagate(clauses, assignment=None):
"""Perform unit propagation on CNF clauses."""
if assignment is None:
assignment = {}
changed = True
while changed:
changed = False
for clause in clauses:
# Filter out satisfied literals
unassigned = [lit for lit in clause
if abs(lit) not in assignment]
# Check if clause is unit
if len(unassigned) == 1:
lit = unassigned[0]
var = abs(lit)
value = lit > 0
if var not in assignment:
assignment[var] = value
changed = True
print(f"Unit propagation: {var} = {value}")
return assignment
# Example
clauses = [[1], [-1, 2], [-2, 3]]
result = unit_propagate(clauses)
print(result) # {1: True, 2: True, 3: True}
Pure Literal Elimination
A pure literal appears with only one polarity in all clauses. Pure literals can be assigned to satisfy all clauses containing them.
(P ∨ Q) ∧ (P ∨ ¬R) ∧ (Q ∨ R)
Here, P is pure (always positive), so we can set P = true.
Python implementation:
def find_pure_literals(clauses):
"""Find pure literals in CNF."""
positive = set()
negative = set()
for clause in clauses:
for lit in clause:
if lit > 0:
positive.add(lit)
else:
negative.add(-lit)
# Pure literals appear in only one polarity
pure = (positive - negative) | (negative - positive)
return pure
# Example
clauses = [[1, 2], [1, -3], [2, 3]]
pure = find_pure_literals(clauses)
print(pure) # {1} (P is pure positive)
Horn Clauses
A Horn clause contains at most one positive literal. Horn clauses are important because they can be solved in polynomial time.
Types of Horn Clauses
Definite clause (exactly one positive literal):
¬P₁ ∨ ¬P₂ ∨ ... ∨ ¬Pₙ ∨ Q
Interpreted as: “if P₁ and P₂ and … and Pₙ then Q”
Goal clause (no positive literals):
¬P₁ ∨ ¬P₂ ∨ ... ∨ ¬Pₙ
Interpreted as: “not (P₁ and P₂ and … and Pₙ)”
Fact (one positive literal, no negative):
Q
Interpreted as: “Q is true”
Example: Horn Formula
(¬A ∨ ¬B ∨ C) ∧ (¬C ∨ D) ∧ (A) ∧ (B)
Interpretation:
- If A and B then C
- If C then D
- A is true
- B is true
Forward chaining solution:
- A = true (fact)
- B = true (fact)
- C = true (from A ∧ B → C)
- D = true (from C → D)
Python Horn clause solver:
def solve_horn(clauses):
"""Solve Horn formula using forward chaining."""
assignment = {}
facts = []
rules = []
# Separate facts from rules
for clause in clauses:
positive = [lit for lit in clause if lit > 0]
negative = [abs(lit) for lit in clause if lit < 0]
if len(positive) == 1 and len(negative) == 0:
# Fact
facts.append(positive[0])
elif len(positive) == 1:
# Rule: negative literals → positive literal
rules.append((negative, positive[0]))
# Initialize with facts
for fact in facts:
assignment[fact] = True
# Forward chaining
changed = True
while changed:
changed = False
for premises, conclusion in rules:
if conclusion not in assignment:
# Check if all premises are satisfied
if all(p in assignment and assignment[p] for p in premises):
assignment[conclusion] = True
changed = True
print(f"Derived: {conclusion} = True")
return assignment
# Example: (¬1 ∨ ¬2 ∨ 3) ∧ (¬3 ∨ 4) ∧ (1) ∧ (2)
clauses = [[-1, -2, 3], [-3, 4], [1], [2]]
result = solve_horn(clauses)
print(result) # {1: True, 2: True, 3: True, 4: True}
Why Horn Clauses Matter
| Property | General CNF | Horn CNF |
|---|---|---|
| Satisfiability | NP-complete | P (polynomial) |
| Algorithm | DPLL/CDCL | Forward chaining |
| Use cases | General SAT | Logic programming, expert systems |
| Example systems | SAT solvers | Prolog, Datalog |
Practical Applications
1. Circuit Verification
Hardware circuits can be encoded as CNF formulas to verify equivalence or find bugs.
Example: XOR gate verification
def xor_to_cnf(a, b, out):
"""Encode XOR gate as CNF: out = a ⊕ b"""
# out ↔ (a ⊕ b)
# out ↔ ((a ∨ b) ∧ ¬(a ∧ b))
# out ↔ ((a ∨ b) ∧ (¬a ∨ ¬b))
clauses = [
# out → ((a ∨ b) ∧ (¬a ∨ ¬b))
[-out, a, b],
[-out, -a, -b],
# ((a ∨ b) ∧ (¬a ∨ ¬b)) → out
[-a, -b, out],
[a, b, out]
]
return clauses
# Verify two XOR implementations are equivalent
xor1_clauses = xor_to_cnf(1, 2, 3) # out1 = a ⊕ b
xor2_clauses = xor_to_cnf(1, 2, 4) # out2 = a ⊕ b
equivalence = [[-3, 4], [3, -4]] # out1 ↔ out2
all_clauses = xor1_clauses + xor2_clauses + equivalence
# If UNSAT, circuits are equivalent
2. Planning Problems
AI planning problems can be encoded as CNF to find action sequences.
Example: Block world planning
def encode_block_world(initial, goal, steps):
"""Encode block world planning as CNF."""
clauses = []
# Variables: on(block, location, time)
# Actions: move(block, from, to, time)
# Initial state
for fact in initial:
clauses.append([fact])
# Goal state (at final time step)
for fact in goal:
clauses.append([fact])
# Frame axioms: if block doesn't move, position stays same
# Action preconditions and effects
# Mutual exclusion: block can't be in two places
return clauses
# Example encoding
initial = [1, 2] # on(A, table, 0), on(B, table, 0)
goal = [3] # on(A, B, 2)
clauses = encode_block_world(initial, goal, steps=2)
3. Cryptography
SAT solvers can attack cryptographic problems by encoding them as CNF.
Example: Hash collision search
def encode_hash_collision(hash_function, target_hash):
"""Encode hash collision problem as CNF."""
clauses = []
# Variables for input bits
input_vars = list(range(1, 257)) # 256-bit input
# Encode hash function as circuit
hash_output = encode_circuit(hash_function, input_vars)
# Constrain output to match target
for i, bit in enumerate(target_hash):
if bit == 1:
clauses.append([hash_output[i]])
else:
clauses.append([-hash_output[i]])
return clauses
4. Sudoku Solver
Sudoku can be elegantly encoded as CNF.
Example: Sudoku encoding
def encode_sudoku(grid):
"""Encode 9x9 Sudoku as CNF."""
clauses = []
# Variable: x_ijk means cell (i,j) has value k
# Variable numbering: 100*i + 10*j + k
def var(i, j, k):
return 100*i + 10*j + k
# Each cell has at least one value
for i in range(1, 10):
for j in range(1, 10):
clauses.append([var(i, j, k) for k in range(1, 10)])
# Each cell has at most one value
for i in range(1, 10):
for j in range(1, 10):
for k1 in range(1, 10):
for k2 in range(k1+1, 10):
clauses.append([-var(i, j, k1), -var(i, j, k2)])
# Each row has all values
for i in range(1, 10):
for k in range(1, 10):
clauses.append([var(i, j, k) for j in range(1, 10)])
# Each column has all values
for j in range(1, 10):
for k in range(1, 10):
clauses.append([var(i, j, k) for i in range(1, 10)])
# Each 3x3 box has all values
for box_i in range(3):
for box_j in range(3):
for k in range(1, 10):
clause = []
for i in range(1, 4):
for j in range(1, 4):
clause.append(var(3*box_i + i, 3*box_j + j, k))
clauses.append(clause)
# Add given values
for i in range(1, 10):
for j in range(1, 10):
if grid[i-1][j-1] != 0:
k = grid[i-1][j-1]
clauses.append([var(i, j, k)])
return clauses
# Example usage
grid = [
[5, 3, 0, 0, 7, 0, 0, 0, 0],
[6, 0, 0, 1, 9, 5, 0, 0, 0],
# ... rest of grid
]
sudoku_cnf = encode_sudoku(grid)
5. Graph Coloring
The graph k-coloring problem: assign k colors to vertices such that no adjacent vertices share a color.
Example: 3-coloring encoding
def encode_graph_coloring(edges, num_vertices, num_colors):
"""Encode graph k-coloring as CNF."""
clauses = []
# Variable: x_vc means vertex v has color c
def var(v, c):
return v * num_colors + c + 1
# Each vertex has at least one color
for v in range(num_vertices):
clauses.append([var(v, c) for c in range(num_colors)])
# Each vertex has at most one color
for v in range(num_vertices):
for c1 in range(num_colors):
for c2 in range(c1+1, num_colors):
clauses.append([-var(v, c1), -var(v, c2)])
# Adjacent vertices have different colors
for v1, v2 in edges:
for c in range(num_colors):
clauses.append([-var(v1, c), -var(v2, c)])
return clauses
# Example: Triangle graph with 3 colors
edges = [(0, 1), (1, 2), (2, 0)]
cnf = encode_graph_coloring(edges, 3, 3)
Common Pitfalls and Best Practices
Pitfall 1: Exponential Blowup
Bad: Naively distributing OR over AND
# (A₁ ∧ B₁) ∨ (A₂ ∧ B₂) ∨ ... ∨ (Aₙ ∧ Bₙ)
# Results in 2ⁿ clauses!
Good: Use Tseitin transformation
# Introduces O(n) auxiliary variables
# Results in O(n) clauses
transformer = TseitinTransformer()
cnf = transformer.transform(formula)
Pitfall 2: Not Simplifying Tautologies
Bad: Keep tautological clauses
clauses = [[1, -1, 2], [3, 4]] # First clause is always true
Good: Remove tautologies
def remove_tautologies(clauses):
"""Remove clauses that are always true."""
result = []
for clause in clauses:
literals = set(clause)
# Check if both P and ¬P exist
is_tautology = any(-lit in literals for lit in literals)
if not is_tautology:
result.append(clause)
return result
Pitfall 3: Duplicate Literals
Bad: Keep duplicate literals in clauses
clause = [1, 2, 1, 3] # P ∨ Q ∨ P ∨ R
Good: Remove duplicates
def simplify_clause(clause):
"""Remove duplicate literals."""
return list(set(clause))
Best Practice: Incremental CNF Construction
Build CNF incrementally for better performance:
class IncrementalCNF:
def __init__(self):
self.clauses = []
self.var_counter = 0
def new_var(self):
self.var_counter += 1
return self.var_counter
def add_clause(self, clause):
"""Add clause with simplification."""
clause = list(set(clause)) # Remove duplicates
# Check for tautology
if any(-lit in clause for lit in clause):
return # Don't add tautology
self.clauses.append(clause)
def add_and(self, a, b):
"""Add clauses for c ↔ (a ∧ b)."""
c = self.new_var()
self.add_clause([-c, a])
self.add_clause([-c, b])
self.add_clause([-a, -b, c])
return c
def add_or(self, a, b):
"""Add clauses for c ↔ (a ∨ b)."""
c = self.new_var()
self.add_clause([-c, a, b])
self.add_clause([-a, c])
self.add_clause([-b, c])
return c
Summary
| Concept | Definition |
|---|---|
| Literal | P or ¬P |
| Clause | Disjunction of literals: P ∨ Q ∨ ¬R |
| CNF | Conjunction of clauses |
| Empty clause | ⊥ (always false) |
| Unit clause | Single literal |
| Horn clause | At most one positive literal |
| Tautology | Clause with both P and ¬P |
Key Points:
- CNF standardizes formula structure for algorithmic processing
- Conversion can be done naively (potentially exponential) or via Tseitin transformation (linear)
- SAT solvers operate on CNF formulas using DIMACS format
- Horn clauses are a tractable subset solvable in polynomial time
- Unit propagation and pure literal elimination are key simplification techniques
- Many real-world problems (Sudoku, graph coloring, planning) can be encoded as CNF
Further Reading
- Boolean Satisfiability (SAT) Problem
- Modern SAT Solvers
- Resolution and Refutation
- Tseitin’s Transformation
Comments