Understanding Conflict-Driven Clause Learning (CDCL)
When you’re trying to solve a Boolean satisfiability problem—determining whether a logical formula can be satisfied by assigning true or false to its variables—you need an algorithm that’s both smart and efficient. Enter CDCL (Conflict-Driven Clause Learning), the algorithm that transformed SAT solving from a theoretical curiosity into a practical tool powering everything from chip design verification to automated planning systems.
In this comprehensive guide, we’ll explore CDCL from the ground up, examining its core components, implementation patterns, and the techniques that make modern SAT solvers capable of handling millions of variables and clauses. Whether you’re building your own solver or simply want to understand how these powerful tools work, this article will give you the deep understanding you need.
What is CDCL?
CDCL is a complete SAT solving algorithm that extends the classic DPLL (Davis-Putnam-Logemann-Loveland) algorithm with two critical innovations: conflict analysis and clause learning. Rather than simply backtracking when it hits a dead end, CDCL analyzes why the conflict occurred and learns a new clause that prevents the same mistake from happening again.
Think of it like debugging code. When you encounter a bug, you don’t just revert your changes randomly—you analyze what went wrong and add a check to prevent that specific issue in the future. CDCL does exactly this for SAT problems.
SAT Problem Basics
Before diving into CDCL, let’s establish the fundamentals. A SAT problem consists of:
- Variables: Boolean variables (x₁, x₂, x₃, …) that can be true or false
- Literals: A variable or its negation (x₁ or ¬x₁)
- Clauses: Disjunctions of literals ((x₁ ∨ ¬x₂ ∨ x₃))
- CNF Formula: Conjunction of clauses
# Good: Clear representation of SAT problem
class Literal:
def __init__(self, variable: int, negated: bool = False):
self.variable = variable
self.negated = negated
def __repr__(self):
return f"{'¬' if self.negated else ''}x{self.variable}"
class Clause:
def __init__(self, literals: list[Literal]):
self.literals = literals
def __repr__(self):
return f"({' ∨ '.join(str(lit) for lit in self.literals)})"
# Example: (x₁ ∨ ¬x₂ ∨ x₃)
clause = Clause([
Literal(1, negated=False),
Literal(2, negated=True),
Literal(3, negated=False)
])
# Bad: Unclear representation mixing concerns
class SATStuff:
def __init__(self, data):
self.data = data # What is data? String? List? Dict?
def check(self):
# Mixing parsing, validation, and solving
return eval(self.data) # Dangerous and unclear
The Evolution: From DPLL to CDCL
The DPLL algorithm, developed in the 1960s, was the foundation for modern SAT solving. It worked by:
- Unit propagation: If a clause has only one unassigned literal, assign it to satisfy that clause
- Pure literal elimination: If a variable appears only in positive or negative form, assign it accordingly
- Backtracking: When a conflict occurs, backtrack to the most recent decision and try a different assignment
While effective, DPLL had a critical weakness: it threw away valuable information when backtracking. If you discovered that a particular combination of assignments led to a contradiction, DPLL would simply undo the last decision and try again—potentially making the same mistake later.
DPLL Implementation Example
# Good: Clean DPLL implementation with clear structure
class DPLLSolver:
def __init__(self, clauses: list[Clause]):
self.clauses = clauses
self.assignment = {}
def solve(self) -> tuple[bool, dict]:
"""Returns (satisfiable, assignment)"""
return self._dpll(self.clauses.copy(), {})
def _dpll(self, clauses: list[Clause], assignment: dict) -> tuple[bool, dict]:
# Check if all clauses satisfied
if all(self._is_satisfied(clause, assignment) for clause in clauses):
return True, assignment
# Check if any clause is unsatisfied
if any(self._is_unsatisfied(clause, assignment) for clause in clauses):
return False, {}
# Unit propagation
unit_clause = self._find_unit_clause(clauses, assignment)
if unit_clause:
literal = self._get_unit_literal(unit_clause, assignment)
new_assignment = assignment.copy()
new_assignment[literal.variable] = not literal.negated
return self._dpll(clauses, new_assignment)
# Pure literal elimination
pure_literal = self._find_pure_literal(clauses, assignment)
if pure_literal:
new_assignment = assignment.copy()
new_assignment[pure_literal.variable] = not pure_literal.negated
return self._dpll(clauses, new_assignment)
# Choose a variable and try both assignments
variable = self._choose_variable(clauses, assignment)
# Try true
new_assignment = assignment.copy()
new_assignment[variable] = True
satisfiable, result = self._dpll(clauses, new_assignment)
if satisfiable:
return True, result
# Try false
new_assignment = assignment.copy()
new_assignment[variable] = False
return self._dpll(clauses, new_assignment)
def _is_satisfied(self, clause: Clause, assignment: dict) -> bool:
"""Check if clause is satisfied by current assignment"""
for literal in clause.literals:
if literal.variable in assignment:
value = assignment[literal.variable]
if (value and not literal.negated) or (not value and literal.negated):
return True
return False
def _is_unsatisfied(self, clause: Clause, assignment: dict) -> bool:
"""Check if clause is unsatisfied (all literals false)"""
for literal in clause.literals:
if literal.variable not in assignment:
return False
value = assignment[literal.variable]
if (value and not literal.negated) or (not value and literal.negated):
return False
return True
# Bad: DPLL with poor structure and no clear separation
def dpll_bad(formula, vals={}):
# What is formula? What is vals?
if check_all(formula, vals): # Unclear function name
return vals
if check_none(formula, vals): # What does this check?
return None
# No clear unit propagation or pure literal logic
for v in range(100): # Magic number, unclear iteration
if v not in vals:
# Try both without clear structure
r1 = dpll_bad(formula, {**vals, v: 1})
if r1: return r1
return dpll_bad(formula, {**vals, v: 0})
The DPLL Limitation
The key problem with DPLL is illustrated here:
# DPLL behavior on conflict
def dpll_backtrack_example():
"""
Problem: (x₁ ∨ x₂) ∧ (¬x₁ ∨ x₃) ∧ (¬x₂ ∨ ¬x₃) ∧ (¬x₁ ∨ ¬x₃)
DPLL tries:
1. x₁ = true
2. x₂ = true (from unit propagation)
3. x₃ = true (from unit propagation)
4. CONFLICT! (¬x₂ ∨ ¬x₃) is violated
5. Backtrack to x₃, try x₃ = false
6. CONFLICT! (¬x₁ ∨ x₃) is violated
7. Backtrack to x₂, try x₂ = false
8. ... continues exploring
Problem: DPLL doesn't learn that x₁ = true is the root cause
"""
pass
In the 1990s, researchers realized they could dramatically improve performance by learning from conflicts. This insight led to CDCL, which has become the dominant approach in modern SAT solvers like MiniSat, Glucose, and CaDiCaL.
How CDCL Works: The Core Components
1. Boolean Constraint Propagation (BCP)
BCP is the engine that drives CDCL forward. When you assign a value to a variable, BCP automatically propagates the consequences:
# Good: Efficient BCP implementation with watch literals
class WatchedLiteralBCP:
def __init__(self, clauses: list[Clause]):
self.clauses = clauses
self.watch_lists = {} # literal -> list of clause indices
self._initialize_watches()
def _initialize_watches(self):
"""Initialize two watched literals per clause"""
for idx, clause in enumerate(self.clauses):
if len(clause.literals) >= 1:
lit1 = clause.literals[0]
self._add_watch(lit1, idx)
if len(clause.literals) >= 2:
lit2 = clause.literals[1]
self._add_watch(lit2, idx)
def _add_watch(self, literal: Literal, clause_idx: int):
"""Add a watch for a literal"""
key = (literal.variable, literal.negated)
if key not in self.watch_lists:
self.watch_lists[key] = []
self.watch_lists[key].append(clause_idx)
def propagate(self, assignment: dict, trail: list) -> tuple[bool, int]:
"""
Propagate assignments using BCP.
Returns (conflict_found, conflicting_clause_idx)
"""
propagation_queue = trail.copy()
while propagation_queue:
literal = propagation_queue.pop(0)
# Check clauses watching the negation of this literal
neg_key = (literal.variable, not literal.negated)
if neg_key not in self.watch_lists:
continue
# Process each clause watching this literal
for clause_idx in self.watch_lists[neg_key][:]:
clause = self.clauses[clause_idx]
# Try to find a new watch
new_watch = self._find_new_watch(clause, assignment)
if new_watch is None:
# Check if clause is unit or conflicting
unassigned = [lit for lit in clause.literals
if lit.variable not in assignment]
if len(unassigned) == 0:
# All literals false - conflict!
return True, clause_idx
elif len(unassigned) == 1:
# Unit clause - propagate
unit_lit = unassigned[0]
assignment[unit_lit.variable] = not unit_lit.negated
propagation_queue.append(unit_lit)
else:
# Update watch
self.watch_lists[neg_key].remove(clause_idx)
self._add_watch(new_watch, clause_idx)
return False, -1
def _find_new_watch(self, clause: Clause, assignment: dict):
"""Find a new literal to watch in the clause"""
for literal in clause.literals:
if literal.variable not in assignment:
return literal
value = assignment[literal.variable]
if (value and not literal.negated) or (not value and literal.negated):
return literal
return None
# Bad: Naive BCP that checks all clauses every time
def bad_bcp(clauses, assignment):
changed = True
while changed:
changed = False
# Inefficient: checking every clause on every iteration
for clause in clauses:
unassigned = []
satisfied = False
for lit in clause:
var = abs(lit)
if var in assignment:
val = assignment[var]
if (lit > 0 and val) or (lit < 0 and not val):
satisfied = True
break
else:
unassigned.append(lit)
if not satisfied and len(unassigned) == 1:
# Unit propagation without tracking why
lit = unassigned[0]
assignment[abs(lit)] = lit > 0
changed = True
elif not satisfied and len(unassigned) == 0:
return False # Conflict, but no information about it
return True
This propagation is crucial because it reduces the search space dramatically. Instead of manually trying every combination, BCP forces certain assignments based on logical consequences.
2. Decision Heuristics
When BCP can’t make any more forced assignments, CDCL must make a decision—choose a variable and assign it a value. The quality of this decision dramatically affects performance.
Modern CDCL solvers use sophisticated heuristics like:
- VSIDS (Variable State Independent Decaying Sum): Prioritize variables that appear frequently in recent conflicts
- Activity-based selection: Track which variables have been involved in conflicts and prefer those
- Polarity tracking: Remember which assignment (true or false) was more successful historically
# Good: VSIDS implementation with activity tracking
class VSIDSHeuristic:
def __init__(self, num_variables: int):
self.activity = {i: 0.0 for i in range(1, num_variables + 1)}
self.activity_increment = 1.0
self.decay_factor = 0.95
self.polarity_cache = {} # variable -> preferred polarity
def bump_activity(self, variable: int):
"""Increase activity score for a variable"""
self.activity[variable] += self.activity_increment
# Normalize if activities get too large
if self.activity[variable] > 1e100:
self._rescale_activities()
def decay_activities(self):
"""Decay all activities and increase increment"""
self.activity_increment /= self.decay_factor
def _rescale_activities(self):
"""Prevent overflow by rescaling all activities"""
for var in self.activity:
self.activity[var] *= 1e-100
self.activity_increment *= 1e-100
def select_variable(self, assignment: dict) -> int:
"""Select unassigned variable with highest activity"""
unassigned = [var for var in self.activity if var not in assignment]
if not unassigned:
return None
return max(unassigned, key=lambda v: self.activity[v])
def select_polarity(self, variable: int) -> bool:
"""Select polarity based on history"""
return self.polarity_cache.get(variable, False)
def save_polarity(self, variable: int, value: bool):
"""Remember successful polarity"""
self.polarity_cache[variable] = value
# Bad: Random or naive variable selection
import random
def bad_decision_heuristic(assignment, variables):
# Random selection - no learning from past conflicts
unassigned = [v for v in variables if v not in assignment]
if unassigned:
return random.choice(unassigned), random.choice([True, False])
return None, None
def bad_decision_heuristic_2(assignment, variables):
# Always picking first unassigned - no intelligence
for v in variables:
if v not in assignment:
return v, True # Always true - ignoring polarity
return None, None
Complete Decision Selection Example
# Good: Integrated decision making with multiple heuristics
class DecisionEngine:
def __init__(self, num_variables: int):
self.vsids = VSIDSHeuristic(num_variables)
self.phase_saving = {} # Variable -> last assigned value
self.decision_level = 0
def make_decision(self, assignment: dict) -> tuple[int, bool, int]:
"""
Make a decision and return (variable, value, decision_level).
Returns (None, None, -1) if all variables assigned.
"""
variable = self.vsids.select_variable(assignment)
if variable is None:
return None, None, -1
# Use phase saving: prefer last assigned value
if variable in self.phase_saving:
value = self.phase_saving[variable]
else:
value = self.vsids.select_polarity(variable)
self.decision_level += 1
return variable, value, self.decision_level
def record_assignment(self, variable: int, value: bool):
"""Record assignment for phase saving"""
self.phase_saving[variable] = value
def backtrack_to_level(self, level: int):
"""Backtrack to a specific decision level"""
self.decision_level = level
3. Conflict Analysis
When BCP detects a conflict—a situation where the current assignment makes the formula unsatisfiable—CDCL doesn’t just backtrack. Instead, it analyzes the conflict to understand why it happened.
The algorithm builds an implication graph showing how each assignment led to subsequent assignments through unit propagation. By analyzing this graph, CDCL can identify the root causes of the conflict.
# Good: Implication graph representation
class ImplicationNode:
def __init__(self, variable: int, value: bool, decision_level: int):
self.variable = variable
self.value = value
self.decision_level = decision_level
self.reason = None # Clause that caused this assignment (None for decisions)
self.antecedents = [] # Variables that led to this assignment
def is_decision(self) -> bool:
return self.reason is None
def __repr__(self):
level_str = f"@{self.decision_level}"
value_str = "T" if self.value else "F"
decision_str = " (decision)" if self.is_decision() else ""
return f"x{self.variable}={value_str}{level_str}{decision_str}"
class ImplicationGraph:
def __init__(self):
self.nodes = {} # variable -> ImplicationNode
self.trail = [] # Chronological order of assignments
def add_decision(self, variable: int, value: bool, level: int):
"""Add a decision node"""
node = ImplicationNode(variable, value, level)
self.nodes[variable] = node
self.trail.append(variable)
def add_implication(self, variable: int, value: bool, level: int,
reason_clause: Clause, antecedents: list[int]):
"""Add an implied assignment"""
node = ImplicationNode(variable, value, level)
node.reason = reason_clause
node.antecedents = antecedents
self.nodes[variable] = node
self.trail.append(variable)
def get_antecedents(self, variable: int) -> list[int]:
"""Get variables that caused this assignment"""
if variable not in self.nodes:
return []
return self.nodes[variable].antecedents
def backtrack_to_level(self, level: int):
"""Remove all assignments above the given level"""
while self.trail and self.nodes[self.trail[-1]].decision_level > level:
var = self.trail.pop()
del self.nodes[var]
# Bad: No implication tracking
class BadAssignmentTracker:
def __init__(self):
self.assignments = {} # Just variable -> value
def assign(self, var, val):
self.assignments[var] = val # No reason, no level, no graph
def unassign(self, var):
del self.assignments[var] # Can't analyze why conflict happened
Conflict Analysis Algorithm
The heart of CDCL is analyzing conflicts to learn useful clauses:
# Good: First UIP conflict analysis
class ConflictAnalyzer:
def __init__(self, implication_graph: ImplicationGraph):
self.graph = implication_graph
def analyze_conflict(self, conflict_clause: Clause,
current_level: int) -> tuple[Clause, int]:
"""
Analyze conflict and return (learned_clause, backjump_level).
Uses First UIP (Unique Implication Point) scheme.
"""
if current_level == 0:
return None, -1 # UNSAT
# Start with conflict clause
learned_literals = set()
for lit in conflict_clause.literals:
learned_literals.add((lit.variable, lit.negated))
# Count variables at current decision level
current_level_count = sum(
1 for var, _ in learned_literals
if self.graph.nodes[var].decision_level == current_level
)
# Resolve until we reach First UIP
trail_index = len(self.graph.trail) - 1
while current_level_count > 1:
# Get last assigned variable at current level
while trail_index >= 0:
var = self.graph.trail[trail_index]
node = self.graph.nodes[var]
trail_index -= 1
if node.decision_level == current_level and \
any(v == var for v, _ in learned_literals):
break
# Remove this variable and add its antecedents
learned_literals = {(v, n) for v, n in learned_literals if v != var}
current_level_count -= 1
# Add antecedents from reason clause
if node.reason:
for lit in node.reason.literals:
if lit.variable != var:
learned_literals.add((lit.variable, not lit.negated))
if self.graph.nodes[lit.variable].decision_level == current_level:
current_level_count += 1
# Build learned clause
learned_clause = Clause([
Literal(var, neg) for var, neg in learned_literals
])
# Compute backjump level (second highest decision level)
levels = sorted(set(
self.graph.nodes[var].decision_level
for var, _ in learned_literals
), reverse=True)
backjump_level = levels[1] if len(levels) > 1 else 0
return learned_clause, backjump_level
def find_uip(self, conflict_clause: Clause, level: int) -> int:
"""Find the First Unique Implication Point"""
# A UIP is a node that all paths from the decision to the conflict pass through
# First UIP is the closest to the conflict
pass # Implemented in analyze_conflict above
# Bad: No proper conflict analysis
def bad_conflict_analysis(conflict_clause):
# Just return the conflict clause without analysis
return conflict_clause, 0 # Always backtrack to level 0
def bad_conflict_analysis_2(conflict_clause, assignment):
# Random clause learning without understanding the conflict
import random
vars_in_conflict = [lit.variable for lit in conflict_clause.literals]
random_subset = random.sample(vars_in_conflict, len(vars_in_conflict) // 2)
# Meaningless learned clause
return Clause([Literal(v, True) for v in random_subset]), 0
Conflict Analysis Example
Let’s walk through a concrete example:
# Example: Analyzing a real conflict
def conflict_analysis_example():
"""
Formula: (x₁ ∨ x₂) ∧ (¬x₁ ∨ x₃) ∧ (¬x₂ ∨ x₄) ∧ (¬x₃ ∨ ¬x₄)
Decision trail:
Level 0: (no decisions)
Level 1: x₁ = true (decision)
x₃ = true (implied by ¬x₁ ∨ x₃)
Level 2: x₂ = true (decision)
x₄ = true (implied by ¬x₂ ∨ x₄)
Conflict: (¬x₃ ∨ ¬x₄) is violated
Analysis:
1. Start with conflict clause: (¬x₃ ∨ ¬x₄)
2. x₄ was implied by x₂, resolve:
(¬x₃ ∨ ¬x₄) with (¬x₂ ∨ x₄) → (¬x₃ ∨ ¬x₂)
3. x₃ was implied by x₁, resolve:
(¬x₃ ∨ ¬x₂) with (¬x₁ ∨ x₃) → (¬x₁ ∨ ¬x₂)
Learned clause: (¬x₁ ∨ ¬x₂)
Meaning: "Don't assign both x₁ and x₂ to true"
Backjump level: 1 (second highest level in learned clause)
"""
graph = ImplicationGraph()
# Level 1
graph.add_decision(1, True, 1)
graph.add_implication(3, True, 1,
Clause([Literal(1, True), Literal(3, False)]),
[1])
# Level 2
graph.add_decision(2, True, 2)
graph.add_implication(4, True, 2,
Clause([Literal(2, True), Literal(4, False)]),
[2])
# Conflict
conflict_clause = Clause([Literal(3, True), Literal(4, True)])
analyzer = ConflictAnalyzer(graph)
learned, backjump = analyzer.analyze_conflict(conflict_clause, 2)
print(f"Learned clause: {learned}") # (¬x₁ ∨ ¬x₂)
print(f"Backjump to level: {backjump}") # 1
4. Clause Learning
Once CDCL understands why a conflict occurred, it learns a new clause that prevents the same situation from happening again. This learned clause is added to the clause database and becomes part of the problem constraints.
# Good: Clause database with learned clause management
class ClauseDatabase:
def __init__(self, original_clauses: list[Clause]):
self.original_clauses = original_clauses
self.learned_clauses = []
self.clause_activity = {} # Track usefulness of learned clauses
self.activity_increment = 1.0
def add_learned_clause(self, clause: Clause):
"""Add a learned clause to the database"""
self.learned_clauses.append(clause)
self.clause_activity[id(clause)] = 0.0
def bump_clause_activity(self, clause: Clause):
"""Increase activity when clause is used in conflict analysis"""
clause_id = id(clause)
if clause_id in self.clause_activity:
self.clause_activity[clause_id] += self.activity_increment
def reduce_database(self, keep_ratio: float = 0.5):
"""Remove less useful learned clauses"""
if len(self.learned_clauses) < 1000:
return # Don't reduce if database is small
# Sort by activity
sorted_clauses = sorted(
self.learned_clauses,
key=lambda c: self.clause_activity.get(id(c), 0.0),
reverse=True
)
# Keep most active clauses
keep_count = int(len(sorted_clauses) * keep_ratio)
self.learned_clauses = sorted_clauses[:keep_count]
# Clean up activity tracking
active_ids = {id(c) for c in self.learned_clauses}
self.clause_activity = {
cid: act for cid, act in self.clause_activity.items()
if cid in active_ids
}
def get_all_clauses(self) -> list[Clause]:
"""Get all clauses (original + learned)"""
return self.original_clauses + self.learned_clauses
def get_learned_count(self) -> int:
return len(self.learned_clauses)
# Bad: No clause management or cleanup
class BadClauseDatabase:
def __init__(self, clauses):
self.clauses = clauses # Mix original and learned
def add_clause(self, clause):
self.clauses.append(clause) # Unbounded growth
# No tracking of usefulness
# No way to remove useless clauses
# Memory will explode on large problems
Clause Learning Strategies
Different strategies for learning clauses:
# Good: Multiple learning schemes
class ClauseLearningStrategy:
@staticmethod
def first_uip(graph: ImplicationGraph, conflict: Clause,
level: int) -> Clause:
"""
First UIP (Unique Implication Point) - most common strategy.
Learns the asserting clause at the first UIP.
"""
# Implementation shown in ConflictAnalyzer above
pass
@staticmethod
def decision_level(graph: ImplicationGraph, conflict: Clause,
level: int) -> Clause:
"""
Decision-level learning: learn clause with all decision variables.
More general but potentially weaker.
"""
decision_vars = []
for var in graph.nodes:
node = graph.nodes[var]
if node.is_decision() and node.decision_level > 0:
# Add negation of decision
decision_vars.append(Literal(var, not node.value))
return Clause(decision_vars)
@staticmethod
def all_uip(graph: ImplicationGraph, conflict: Clause,
level: int) -> list[Clause]:
"""
Learn clauses at all UIPs (rarely used - too many clauses).
"""
# Would return multiple learned clauses
pass
@staticmethod
def minimize_learned_clause(clause: Clause, graph: ImplicationGraph) -> Clause:
"""
Minimize learned clause by removing redundant literals.
"""
essential_literals = []
for lit in clause.literals:
# Check if this literal is implied by others
is_redundant = False
other_lits = [l for l in clause.literals if l.variable != lit.variable]
# If all paths to this literal go through other literals in clause,
# it's redundant
if not ClauseLearningStrategy._is_essential(lit, other_lits, graph):
is_redundant = True
if not is_redundant:
essential_literals.append(lit)
return Clause(essential_literals)
@staticmethod
def _is_essential(literal: Literal, other_literals: list[Literal],
graph: ImplicationGraph) -> bool:
"""Check if literal is essential (not implied by others)"""
# Simplified check
return True # Full implementation would do reachability analysis
The learned clause is typically a generalization of the conflict, capturing the essential constraint that led to the problem.
Asserting Clauses
A key property of learned clauses in CDCL:
# Good: Ensuring learned clauses are asserting
class AssertingClauseValidator:
@staticmethod
def is_asserting(clause: Clause, graph: ImplicationGraph,
backjump_level: int) -> bool:
"""
An asserting clause has exactly one literal at the current level
and becomes unit after backjumping.
"""
current_level = max(
graph.nodes[var].decision_level
for var in graph.nodes
)
literals_at_current = [
lit for lit in clause.literals
if graph.nodes[lit.variable].decision_level == current_level
]
# Should have exactly one literal at current level
if len(literals_at_current) != 1:
return False
# After backjumping, this clause should be unit
literals_above_backjump = [
lit for lit in clause.literals
if graph.nodes[lit.variable].decision_level > backjump_level
]
return len(literals_above_backjump) == 1
@staticmethod
def make_asserting(clause: Clause, graph: ImplicationGraph) -> Clause:
"""Ensure clause is asserting by proper literal ordering"""
# Put the asserting literal (highest level) first
sorted_literals = sorted(
clause.literals,
key=lambda lit: graph.nodes[lit.variable].decision_level,
reverse=True
)
return Clause(sorted_literals)
# Bad: Learning non-asserting clauses
def bad_clause_learning(conflict):
# Just negate all literals in conflict
learned = Clause([
Literal(lit.variable, not lit.negated)
for lit in conflict.literals
])
# This might not be asserting!
# Might not help with backjumping!
return learned
5. Non-Chronological Backjumping
Instead of backtracking one decision at a time (chronological backtracking), CDCL uses the learned clause to determine how far back to jump. This can skip multiple decision levels, dramatically reducing the search space.
# Good: Non-chronological backjumping implementation
class BackjumpEngine:
def __init__(self, graph: ImplicationGraph, decision_engine: DecisionEngine):
self.graph = graph
self.decision_engine = decision_engine
def backjump(self, learned_clause: Clause, backjump_level: int,
assignment: dict) -> int:
"""
Backjump to the appropriate level and apply the learned clause.
Returns the new decision level.
"""
# Remove all assignments above backjump level
self.graph.backtrack_to_level(backjump_level)
self.decision_engine.backtrack_to_level(backjump_level)
# Remove assignments from assignment dict
vars_to_remove = [
var for var in assignment
if var not in self.graph.nodes
]
for var in vars_to_remove:
del assignment[var]
# The learned clause should now be unit - propagate it
unassigned_literals = [
lit for lit in learned_clause.literals
if lit.variable not in assignment
]
if len(unassigned_literals) == 1:
# Unit clause - make the forced assignment
lit = unassigned_literals[0]
assignment[lit.variable] = not lit.negated
self.graph.add_implication(
lit.variable, not lit.negated, backjump_level,
learned_clause,
[l.variable for l in learned_clause.literals if l.variable != lit.variable]
)
return backjump_level
def compute_backjump_level(self, learned_clause: Clause) -> int:
"""
Compute the backjump level from the learned clause.
This is the second-highest decision level in the clause.
"""
if not learned_clause.literals:
return 0
levels = sorted(set(
self.graph.nodes[lit.variable].decision_level
for lit in learned_clause.literals
if lit.variable in self.graph.nodes
), reverse=True)
if len(levels) <= 1:
return 0
return levels[1] # Second highest level
# Bad: Chronological backtracking (like DPLL)
def bad_backtrack(assignment, trail):
# Just undo last decision
if trail:
last_var = trail.pop()
del assignment[last_var]
# Doesn't use learned clause information
# Might backtrack through many levels unnecessarily
return len(trail)
Backjumping Example
# Example: Demonstrating backjumping power
def backjumping_example():
"""
Decision stack:
Level 0: (initial)
Level 1: x₁ = true (decision)
Level 2: x₂ = true (decision)
Level 3: x₃ = true (decision)
Level 4: x₄ = true (decision)
x₅ = true (implied)
CONFLICT!
Learned clause: (¬x₁ ∨ ¬x₄)
Chronological backtracking would:
- Backtrack to level 3, try x₄ = false
- Still conflict
- Backtrack to level 2, try x₃ = false
- Still conflict
- Finally backtrack to level 1
Non-chronological backjumping:
- Analyze learned clause (¬x₁ ∨ ¬x₄)
- Highest level: 4 (x₄)
- Second highest: 1 (x₁)
- Jump directly to level 1!
- Skip levels 2 and 3 entirely
"""
print("Chronological: 4 → 3 → 2 → 1 (3 backtracks)")
print("Non-chronological: 4 → 1 (1 backjump)")
print("Savings: 2 unnecessary explorations avoided")
Comparison: Chronological vs Non-Chronological
# Good: Non-chronological backjumping
class CDCLBacktracking:
def handle_conflict(self, conflict_clause, current_level):
# Analyze conflict
learned_clause, backjump_level = self.analyze_conflict(
conflict_clause, current_level
)
# Add learned clause
self.clause_db.add_learned_clause(learned_clause)
# Backjump to computed level (might skip many levels)
self.backjump(backjump_level)
return backjump_level
# Bad: Chronological backtracking
class DPLLBacktracking:
def handle_conflict(self, current_level):
# Just go back one level
self.backtrack(current_level - 1)
# No learning, no intelligent jumping
return current_level - 1
Why CDCL Matters
The impact of CDCL on SAT solving has been profound:
Performance: CDCL solvers are orders of magnitude faster than DPLL on industrial problems. Problems that would take years to solve with DPLL can be solved in seconds with CDCL.
Practical Applications: CDCL enables real-world applications:
- Hardware verification: Checking that chip designs meet specifications
- Software testing: Automated test case generation
- Planning and scheduling: Finding valid schedules for complex constraints
- Cryptanalysis: Breaking certain cryptographic systems
- Package dependency resolution: Solving complex dependency constraints
- Configuration management: Finding valid system configurations
Theoretical Insights: CDCL has led to deeper understanding of SAT solving complexity and the structure of satisfiable formulas.
Real-World Application Examples
# Good: Using CDCL for package dependency resolution
class PackageDependencySolver:
def __init__(self):
self.packages = {}
self.var_to_package = {}
self.next_var = 1
def add_package(self, name: str, version: str,
dependencies: list[tuple[str, str]],
conflicts: list[tuple[str, str]]):
"""Add a package with its dependencies and conflicts"""
pkg_id = f"{name}@{version}"
var = self.next_var
self.next_var += 1
self.packages[pkg_id] = {
'variable': var,
'dependencies': dependencies,
'conflicts': conflicts
}
self.var_to_package[var] = pkg_id
def generate_sat_formula(self, required: list[str]) -> list[Clause]:
"""Generate SAT formula for dependency resolution"""
clauses = []
# Required packages must be installed
for pkg in required:
if pkg in self.packages:
var = self.packages[pkg]['variable']
clauses.append(Clause([Literal(var, False)]))
# Dependency constraints
for pkg_id, pkg_info in self.packages.items():
pkg_var = pkg_info['variable']
# If package is installed, at least one version of each dependency must be installed
for dep_name, dep_version in pkg_info['dependencies']:
dep_id = f"{dep_name}@{dep_version}"
if dep_id in self.packages:
dep_var = self.packages[dep_id]['variable']
# pkg → dep (equivalent to ¬pkg ∨ dep)
clauses.append(Clause([
Literal(pkg_var, True),
Literal(dep_var, False)
]))
# Conflict constraints
for conf_name, conf_version in pkg_info['conflicts']:
conf_id = f"{conf_name}@{conf_version}"
if conf_id in self.packages:
conf_var = self.packages[conf_id]['variable']
# ¬pkg ∨ ¬conf (can't install both)
clauses.append(Clause([
Literal(pkg_var, True),
Literal(conf_var, True)
]))
return clauses
def solve_dependencies(self, required: list[str]) -> dict:
"""Solve dependency constraints and return installation plan"""
clauses = self.generate_sat_formula(required)
solver = CDCLSolver(clauses, self.next_var - 1)
satisfiable, assignment = solver.solve()
if not satisfiable:
return None # No valid installation
# Extract installed packages
installed = []
for var, value in assignment.items():
if value and var in self.var_to_package:
installed.append(self.var_to_package[var])
return installed
# Example usage
dep_solver = PackageDependencySolver()
dep_solver.add_package("app", "1.0", [("lib", "2.0")], [])
dep_solver.add_package("lib", "2.0", [], [("lib", "1.0")])
dep_solver.add_package("lib", "1.0", [], [("lib", "2.0")])
result = dep_solver.solve_dependencies(["[email protected]"])
print(f"Install: {result}") # ['[email protected]', '[email protected]']
# Good: Using CDCL for scheduling
class SchedulingSolver:
def __init__(self, num_tasks: int, num_time_slots: int):
self.num_tasks = num_tasks
self.num_time_slots = num_time_slots
self.next_var = 1
self.task_time_vars = {} # (task, time) -> variable
# Create variables for each task-time combination
for task in range(num_tasks):
for time in range(num_time_slots):
self.task_time_vars[(task, time)] = self.next_var
self.next_var += 1
def generate_constraints(self, conflicts: list[tuple[int, int]]) -> list[Clause]:
"""Generate scheduling constraints"""
clauses = []
# Each task must be scheduled exactly once
for task in range(self.num_tasks):
# At least one time slot
at_least_one = Clause([
Literal(self.task_time_vars[(task, time)], False)
for time in range(self.num_time_slots)
])
clauses.append(at_least_one)
# At most one time slot
for time1 in range(self.num_time_slots):
for time2 in range(time1 + 1, self.num_time_slots):
at_most_one = Clause([
Literal(self.task_time_vars[(task, time1)], True),
Literal(self.task_time_vars[(task, time2)], True)
])
clauses.append(at_most_one)
# Conflict constraints: conflicting tasks can't be at same time
for task1, task2 in conflicts:
for time in range(self.num_time_slots):
conflict_clause = Clause([
Literal(self.task_time_vars[(task1, time)], True),
Literal(self.task_time_vars[(task2, time)], True)
])
clauses.append(conflict_clause)
return clauses
def solve_schedule(self, conflicts: list[tuple[int, int]]) -> dict:
"""Solve scheduling problem"""
clauses = self.generate_constraints(conflicts)
solver = CDCLSolver(clauses, self.next_var - 1)
satisfiable, assignment = solver.solve()
if not satisfiable:
return None
# Extract schedule
schedule = {}
for (task, time), var in self.task_time_vars.items():
if assignment.get(var, False):
schedule[task] = time
return schedule
# Bad: Hardcoded or brute-force approach
def bad_dependency_solver(packages, required):
# Try all combinations - exponential time
from itertools import combinations
all_packages = list(packages.keys())
for r in range(len(all_packages) + 1):
for combo in combinations(all_packages, r):
if check_valid(combo, required): # Unclear validation
return combo
return None # Very slow for large problems
The Complete CDCL Algorithm
Now let’s put all the pieces together into a complete, working CDCL solver:
# Good: Complete CDCL solver implementation
class CDCLSolver:
def __init__(self, clauses: list[Clause], num_variables: int):
self.original_clauses = clauses
self.num_variables = num_variables
# Core components
self.clause_db = ClauseDatabase(clauses)
self.graph = ImplicationGraph()
self.decision_engine = DecisionEngine(num_variables)
self.vsids = VSIDSHeuristic(num_variables)
self.bcp = WatchedLiteralBCP(clauses)
self.conflict_analyzer = ConflictAnalyzer(self.graph)
self.backjump_engine = BackjumpEngine(self.graph, self.decision_engine)
# State
self.assignment = {}
self.decision_level = 0
self.conflicts = 0
self.restarts = 0
def solve(self) -> tuple[bool, dict]:
"""
Main CDCL solving loop.
Returns (satisfiable, assignment).
"""
# Initial unit propagation
conflict, _ = self.bcp.propagate(self.assignment, [])
if conflict:
return False, {} # UNSAT at level 0
while True:
# Check if all variables assigned
if len(self.assignment) == self.num_variables:
return True, self.assignment # SAT!
# Make a decision
variable, value, level = self.decision_engine.make_decision(
self.assignment
)
if variable is None:
return True, self.assignment # SAT!
self.decision_level = level
self.assignment[variable] = value
self.graph.add_decision(variable, value, level)
self.decision_engine.record_assignment(variable, value)
# Propagate and handle conflicts
while True:
conflict, conflict_clause_idx = self.bcp.propagate(
self.assignment,
self.graph.trail
)
if not conflict:
break # No conflict, continue with next decision
# Handle conflict
self.conflicts += 1
if self.decision_level == 0:
return False, {} # UNSAT!
# Analyze conflict and learn clause
conflict_clause = self.clause_db.get_all_clauses()[conflict_clause_idx]
learned_clause, backjump_level = self.conflict_analyzer.analyze_conflict(
conflict_clause,
self.decision_level
)
# Add learned clause
self.clause_db.add_learned_clause(learned_clause)
# Update VSIDS activities
for lit in learned_clause.literals:
self.vsids.bump_activity(lit.variable)
self.vsids.decay_activities()
# Backjump
self.decision_level = self.backjump_engine.backjump(
learned_clause,
backjump_level,
self.assignment
)
# Check for restart
if self.should_restart():
self.restart()
# Periodic maintenance
if self.conflicts % 5000 == 0:
self.clause_db.reduce_database()
def should_restart(self) -> bool:
"""Determine if solver should restart"""
# Luby restart sequence or geometric
restart_threshold = 100 * (1.5 ** self.restarts)
return self.conflicts >= restart_threshold
def restart(self):
"""Restart search while keeping learned clauses"""
self.assignment = {}
self.graph = ImplicationGraph()
self.decision_level = 0
self.restarts += 1
# Keep learned clauses and VSIDS scores
# Bad: Incomplete or poorly structured solver
class BadSolver:
def solve(self, formula):
# No clear structure or components
assignment = {}
while True:
# Unclear what this does
if self.check(formula, assignment):
return assignment
# No proper BCP, conflict analysis, or learning
var = self.pick_var()
assignment[var] = True
if self.conflict(formula, assignment):
# Just backtrack without learning
del assignment[var]
assignment[var] = False
if self.conflict(formula, assignment):
# Give up too easily
return None
CDCL Algorithm Walkthrough
Let’s trace through a complete example:
# Example: Complete CDCL execution trace
def cdcl_walkthrough():
"""
Formula: (x₁ ∨ x₂) ∧ (¬x₁ ∨ x₃) ∧ (¬x₂ ∨ x₄) ∧ (¬x₃ ∨ ¬x₄) ∧ (x₃ ∨ x₄)
Execution trace:
Level 0: Initial state
- No assignments
- Run BCP: no unit clauses
Level 1: Decision
- VSIDS selects x₁ (highest activity)
- Decide x₁ = true
- BCP: (¬x₁ ∨ x₃) becomes unit → x₃ = true
- No conflict
Level 2: Decision
- VSIDS selects x₂
- Decide x₂ = true
- BCP: (¬x₂ ∨ x₄) becomes unit → x₄ = true
- CONFLICT! (¬x₃ ∨ ¬x₄) is violated
Conflict Analysis:
- Conflict clause: (¬x₃ ∨ ¬x₄)
- x₄ implied by x₂ at level 2
- x₃ implied by x₁ at level 1
- Resolve: learned clause = (¬x₁ ∨ ¬x₂)
- Backjump level: 1
Backjump to Level 1:
- Undo x₂ and x₄ assignments
- Learned clause (¬x₁ ∨ ¬x₂) is now unit
- BCP: x₂ = false (forced)
- Continue...
Level 2: Decision (after backjump)
- x₁ = true, x₂ = false, x₃ = true
- Need to assign x₄
- BCP: (x₃ ∨ x₄) is satisfied by x₃
- BCP: (¬x₂ ∨ x₄) becomes unit → x₄ = true
- Check (¬x₃ ∨ ¬x₄): violated!
- But wait, (x₃ ∨ x₄) requires one to be true
- This is actually UNSAT or needs more analysis...
"""
# Initialize solver
clauses = [
Clause([Literal(1, False), Literal(2, False)]), # (x₁ ∨ x₂)
Clause([Literal(1, True), Literal(3, False)]), # (¬x₁ ∨ x₃)
Clause([Literal(2, True), Literal(4, False)]), # (¬x₂ ∨ x₄)
Clause([Literal(3, True), Literal(4, True)]), # (¬x₃ ∨ ¬x₄)
Clause([Literal(3, False), Literal(4, False)]) # (x₃ ∨ x₄)
]
solver = CDCLSolver(clauses, 4)
satisfiable, assignment = solver.solve()
print(f"Result: {'SAT' if satisfiable else 'UNSAT'}")
if satisfiable:
print(f"Assignment: {assignment}")
Modern Enhancements
Contemporary CDCL solvers have added several refinements that dramatically improve performance:
1. Restarts
Restarts prevent the solver from getting stuck in unproductive search spaces:
# Good: Intelligent restart strategy
class RestartStrategy:
def __init__(self):
self.restart_count = 0
self.conflicts_at_restart = 0
self.base_restart_interval = 100
def should_restart(self, conflicts: int, learned_clauses: int) -> bool:
"""
Luby sequence restart strategy.
Restarts at intervals: 1, 1, 2, 1, 1, 2, 4, 1, 1, 2, 1, 1, 2, 4, 8, ...
"""
conflicts_since_restart = conflicts - self.conflicts_at_restart
threshold = self.base_restart_interval * self._luby(self.restart_count)
return conflicts_since_restart >= threshold
def _luby(self, i: int) -> int:
"""Compute i-th element of Luby sequence"""
k = 0
while (1 << (k + 1)) - 1 <= i:
k += 1
if (1 << k) - 1 == i:
return 1 << (k - 1)
else:
return self._luby(i - (1 << k) + 1)
def perform_restart(self, solver):
"""Execute restart while preserving learned information"""
self.restart_count += 1
self.conflicts_at_restart = solver.conflicts
# Clear assignment and decision stack
solver.assignment = {}
solver.graph = ImplicationGraph()
solver.decision_level = 0
# Keep learned clauses and VSIDS activities
# This is the key: we don't lose learned knowledge
# Alternative: Geometric restart
class GeometricRestart:
def __init__(self, initial: int = 100, factor: float = 1.5):
self.initial = initial
self.factor = factor
self.restart_count = 0
def should_restart(self, conflicts: int) -> bool:
threshold = self.initial * (self.factor ** self.restart_count)
return conflicts >= threshold
# Bad: No restart or random restart
class BadRestart:
def should_restart(self):
# Never restart - can get stuck
return False
class BadRestart2:
def should_restart(self):
# Random restart - loses progress
import random
return random.random() < 0.01 # 1% chance
2. Clause Deletion
Learned clauses accumulate quickly and must be managed:
# Good: Activity-based clause deletion
class ClauseDeletionStrategy:
def __init__(self, clause_db: ClauseDatabase):
self.clause_db = clause_db
self.deletion_threshold = 2000
def should_reduce(self) -> bool:
"""Check if we should delete clauses"""
return len(self.clause_db.learned_clauses) > self.deletion_threshold
def reduce_database(self):
"""Remove less useful learned clauses"""
learned = self.clause_db.learned_clauses
# Never delete clauses of size 2 (binary clauses are very useful)
binary_clauses = [c for c in learned if len(c.literals) == 2]
other_clauses = [c for c in learned if len(c.literals) > 2]
# Sort by activity (LBD score in modern solvers)
other_clauses.sort(
key=lambda c: self.clause_db.clause_activity.get(id(c), 0.0)
)
# Keep top 50% of active clauses
keep_count = len(other_clauses) // 2
kept_clauses = binary_clauses + other_clauses[:keep_count]
self.clause_db.learned_clauses = kept_clauses
# Increase threshold for next reduction
self.deletion_threshold = int(self.deletion_threshold * 1.1)
def compute_lbd(self, clause: Clause, graph: ImplicationGraph) -> int:
"""
Compute Literal Block Distance (LBD) - number of decision levels.
Lower LBD = more useful clause.
"""
levels = set()
for lit in clause.literals:
if lit.variable in graph.nodes:
levels.add(graph.nodes[lit.variable].decision_level)
return len(levels)
# Bad: No clause deletion or naive deletion
class BadClauseDeletion:
def reduce(self, clauses):
# Delete oldest clauses without considering usefulness
return clauses[-1000:] # Keep last 1000
def reduce_random(self, clauses):
# Random deletion
import random
return random.sample(clauses, len(clauses) // 2)
3. Preprocessing
Simplifying the formula before solving:
# Good: Comprehensive preprocessing
class Preprocessor:
def __init__(self, clauses: list[Clause]):
self.clauses = clauses
self.eliminated_vars = set()
def preprocess(self) -> list[Clause]:
"""Apply multiple preprocessing techniques"""
clauses = self.clauses.copy()
# Remove tautologies
clauses = self.remove_tautologies(clauses)
# Subsumption elimination
clauses = self.eliminate_subsumed(clauses)
# Variable elimination
clauses = self.eliminate_variables(clauses)
# Unit propagation at level 0
clauses, assignment = self.unit_propagation(clauses)
return clauses
def remove_tautologies(self, clauses: list[Clause]) -> list[Clause]:
"""Remove clauses like (x ∨ ¬x ∨ y) which are always true"""
result = []
for clause in clauses:
variables = {}
is_tautology = False
for lit in clause.literals:
if lit.variable in variables:
if variables[lit.variable] != lit.negated:
is_tautology = True
break
variables[lit.variable] = lit.negated
if not is_tautology:
result.append(clause)
return result
def eliminate_subsumed(self, clauses: list[Clause]) -> list[Clause]:
"""
Remove subsumed clauses.
Clause C1 subsumes C2 if all literals of C1 appear in C2.
Example: (x ∨ y) subsumes (x ∨ y ∨ z)
"""
result = []
for i, clause1 in enumerate(clauses):
is_subsumed = False
for j, clause2 in enumerate(clauses):
if i == j:
continue
# Check if clause2 subsumes clause1
lits1 = set((lit.variable, lit.negated) for lit in clause1.literals)
lits2 = set((lit.variable, lit.negated) for lit in clause2.literals)
if lits2.issubset(lits1) and len(lits2) < len(lits1):
is_subsumed = True
break
if not is_subsumed:
result.append(clause1)
return result
def eliminate_variables(self, clauses: list[Clause]) -> list[Clause]:
"""
Variable elimination by resolution.
If a variable appears in few clauses, eliminate it.
"""
# Find variables that appear in few clauses
var_occurrences = {}
for clause in clauses:
for lit in clause.literals:
if lit.variable not in var_occurrences:
var_occurrences[lit.variable] = {'pos': 0, 'neg': 0}
if lit.negated:
var_occurrences[lit.variable]['neg'] += 1
else:
var_occurrences[lit.variable]['pos'] += 1
# Eliminate variables where pos * neg < threshold
for var, counts in var_occurrences.items():
if counts['pos'] * counts['neg'] < 10: # Threshold
clauses = self._eliminate_variable(clauses, var)
self.eliminated_vars.add(var)
return clauses
def _eliminate_variable(self, clauses: list[Clause], var: int) -> list[Clause]:
"""Eliminate a variable by resolution"""
pos_clauses = [c for c in clauses if any(
lit.variable == var and not lit.negated for lit in c.literals
)]
neg_clauses = [c for c in clauses if any(
lit.variable == var and lit.negated for lit in c.literals
)]
other_clauses = [c for c in clauses if all(
lit.variable != var for lit in c.literals
)]
# Resolve all pairs
resolvents = []
for pos_clause in pos_clauses:
for neg_clause in neg_clauses:
resolvent = self._resolve(pos_clause, neg_clause, var)
if resolvent:
resolvents.append(resolvent)
return other_clauses + resolvents
def _resolve(self, clause1: Clause, clause2: Clause, var: int) -> Clause:
"""Resolve two clauses on a variable"""
lits1 = [lit for lit in clause1.literals if lit.variable != var]
lits2 = [lit for lit in clause2.literals if lit.variable != var]
return Clause(lits1 + lits2)
def unit_propagation(self, clauses: list[Clause]) -> tuple[list[Clause], dict]:
"""Propagate unit clauses at level 0"""
assignment = {}
changed = True
while changed:
changed = False
for clause in clauses:
unassigned = [lit for lit in clause.literals
if lit.variable not in assignment]
if len(unassigned) == 1:
lit = unassigned[0]
assignment[lit.variable] = not lit.negated
changed = True
# Simplify clauses with assignment
simplified = []
for clause in clauses:
satisfied = False
new_lits = []
for lit in clause.literals:
if lit.variable in assignment:
val = assignment[lit.variable]
if (val and not lit.negated) or (not val and lit.negated):
satisfied = True
break
else:
new_lits.append(lit)
if not satisfied and new_lits:
simplified.append(Clause(new_lits))
return simplified, assignment
# Bad: No preprocessing
class BadPreprocessor:
def preprocess(self, clauses):
# Just return clauses unchanged
return clauses
# Missing opportunities for simplification
4. Inprocessing
Simplification during search:
# Good: Inprocessing techniques
class Inprocessor:
def __init__(self, solver):
self.solver = solver
self.last_inprocess = 0
def should_inprocess(self) -> bool:
"""Check if we should run inprocessing"""
conflicts_since = self.solver.conflicts - self.last_inprocess
return conflicts_since > 5000
def inprocess(self):
"""Run inprocessing techniques"""
self.last_inprocess = self.solver.conflicts
# Subsumption with learned clauses
self._subsumption_check()
# Strengthen clauses
self._self_subsumption()
# Variable elimination on learned clauses
self._bounded_variable_elimination()
def _subsumption_check(self):
"""Check if learned clauses subsume original clauses"""
all_clauses = self.solver.clause_db.get_all_clauses()
to_remove = []
for i, clause1 in enumerate(all_clauses):
for j, clause2 in enumerate(all_clauses):
if i >= j:
continue
if self._subsumes(clause1, clause2):
to_remove.append(j)
# Remove subsumed clauses
for idx in sorted(to_remove, reverse=True):
if idx < len(self.solver.clause_db.original_clauses):
continue # Don't remove original clauses
learned_idx = idx - len(self.solver.clause_db.original_clauses)
del self.solver.clause_db.learned_clauses[learned_idx]
def _subsumes(self, clause1: Clause, clause2: Clause) -> bool:
"""Check if clause1 subsumes clause2"""
lits1 = set((lit.variable, lit.negated) for lit in clause1.literals)
lits2 = set((lit.variable, lit.negated) for lit in clause2.literals)
return lits1.issubset(lits2) and len(lits1) < len(lits2)
def _self_subsumption(self):
"""Strengthen clauses by self-subsumption resolution"""
# Advanced technique - simplified here
pass
def _bounded_variable_elimination(self):
"""Eliminate variables with few occurrences"""
# Similar to preprocessing but on current clause database
pass
Performance Optimization Techniques
Memory Management
# Good: Efficient memory management
class EfficientClauseStorage:
def __init__(self):
# Use arrays instead of objects for better cache locality
self.literals = [] # Flat array of literals
self.clause_starts = [] # Indices where each clause starts
self.clause_lengths = [] # Length of each clause
def add_clause(self, literals: list[int]):
"""Add clause as flat array (positive = var, negative = -var)"""
self.clause_starts.append(len(self.literals))
self.clause_lengths.append(len(literals))
self.literals.extend(literals)
def get_clause(self, idx: int) -> list[int]:
"""Retrieve clause by index"""
start = self.clause_starts[idx]
length = self.clause_lengths[idx]
return self.literals[start:start + length]
def iterate_clause(self, idx: int):
"""Memory-efficient iteration"""
start = self.clause_starts[idx]
length = self.clause_lengths[idx]
for i in range(start, start + length):
yield self.literals[i]
# Good: Watched literals for efficient BCP
class WatchedLiterals:
"""
Two-watched-literal scheme: only watch two literals per clause.
When one becomes false, find a new watch instead of checking all literals.
"""
def __init__(self):
self.watches = {} # literal -> list of clause indices
def add_watch(self, literal: int, clause_idx: int):
"""Add a watch for a literal"""
if literal not in self.watches:
self.watches[literal] = []
self.watches[literal].append(clause_idx)
def get_watches(self, literal: int) -> list[int]:
"""Get clauses watching this literal"""
return self.watches.get(literal, [])
# Bad: Inefficient memory usage
class BadClauseStorage:
def __init__(self):
# Creating objects for everything - poor cache locality
self.clauses = []
def add_clause(self, literals):
# Each literal is an object with many fields
clause = {
'id': len(self.clauses),
'literals': [{'var': abs(lit), 'neg': lit < 0,
'metadata': {}} for lit in literals],
'metadata': {},
'activity': 0.0,
'lbd': 0,
# ... many more fields
}
self.clauses.append(clause)
# Memory overhead is huge!
Algorithmic Optimizations
# Good: Lazy data structures
class LazyImplicationGraph:
"""Don't build full graph unless needed for conflict analysis"""
def __init__(self):
self.reasons = {} # variable -> reason clause
self.levels = {} # variable -> decision level
self.trail = [] # Assignment order
def add_assignment(self, var: int, level: int, reason=None):
"""Record assignment without building full graph"""
self.reasons[var] = reason
self.levels[var] = level
self.trail.append(var)
def build_graph_for_conflict(self, conflict_clause):
"""Only build graph when analyzing conflict"""
# Build minimal graph needed for this conflict
pass
# Good: Incremental SAT solving
class IncrementalSolver:
"""Reuse learned clauses across multiple related problems"""
def __init__(self):
self.solver = None
self.assumptions = []
def solve_with_assumptions(self, assumptions: list[int]) -> bool:
"""
Solve with temporary assumptions.
Learned clauses are kept for next solve.
"""
if self.solver is None:
self.solver = CDCLSolver([], 0)
# Add assumptions as unit clauses at level 0
self.assumptions = assumptions
# Solve
result, _ = self.solver.solve()
# Keep learned clauses for next call
return result
def add_clause(self, clause: Clause):
"""Add permanent clause"""
if self.solver:
self.solver.clause_db.original_clauses.append(clause)
# Bad: Rebuilding everything
class BadIncrementalSolver:
def solve(self, clauses, assumptions):
# Create new solver every time - lose all learned clauses
solver = CDCLSolver(clauses + assumptions, 100)
return solver.solve()
# All learned knowledge is lost!
Parallel CDCL
# Good: Portfolio-based parallel solving
import multiprocessing
from typing import Optional
class ParallelCDCLSolver:
def __init__(self, clauses: list[Clause], num_workers: int = 4):
self.clauses = clauses
self.num_workers = num_workers
def solve_parallel(self) -> tuple[bool, Optional[dict]]:
"""
Run multiple solvers with different configurations in parallel.
Return as soon as one finds a solution.
"""
# Create different configurations
configs = self._generate_configs()
# Use multiprocessing
with multiprocessing.Pool(self.num_workers) as pool:
results = pool.map(self._solve_with_config, configs)
# Return first SAT result, or UNSAT if all agree
for satisfiable, assignment in results:
if satisfiable:
return True, assignment
return False, None
def _generate_configs(self) -> list[dict]:
"""Generate different solver configurations"""
return [
{'restart_strategy': 'luby', 'vsids_decay': 0.95},
{'restart_strategy': 'geometric', 'vsids_decay': 0.90},
{'restart_strategy': 'luby', 'vsids_decay': 0.99},
{'restart_strategy': 'geometric', 'vsids_decay': 0.85},
]
def _solve_with_config(self, config: dict) -> tuple[bool, Optional[dict]]:
"""Solve with specific configuration"""
solver = CDCLSolver(self.clauses, self._count_variables())
# Apply configuration
# ... configure solver ...
return solver.solve()
def _count_variables(self) -> int:
"""Count unique variables in clauses"""
variables = set()
for clause in self.clauses:
for lit in clause.literals:
variables.add(lit.variable)
return len(variables)
# Good: Clause sharing between parallel solvers
class ClauseSharing:
"""Share learned clauses between parallel solvers"""
def __init__(self):
self.shared_clauses = multiprocessing.Queue()
self.clause_limit = 8 # Only share short, useful clauses
def share_clause(self, clause: Clause, lbd: int):
"""Share a learned clause if it's useful"""
if len(clause.literals) <= self.clause_limit and lbd <= 5:
self.shared_clauses.put(clause)
def import_clauses(self, solver: CDCLSolver):
"""Import shared clauses into solver"""
while not self.shared_clauses.empty():
try:
clause = self.shared_clauses.get_nowait()
solver.clause_db.add_learned_clause(clause)
except:
break
# Bad: No parallelization or naive parallelization
class BadParallelSolver:
def solve_parallel(self, clauses):
# Run same solver multiple times - no benefit
import multiprocessing
with multiprocessing.Pool(4) as pool:
results = pool.map(lambda _: self.solve(clauses), range(4))
return results[0] # All do the same work!
CDCL represents a fundamental shift in how we approach SAT solving. By learning from conflicts rather than simply backtracking, it transforms an exponential search problem into something tractable for many real-world instances. The algorithm’s elegance lies in its simplicity—the core idea is straightforward—yet its power comes from the sophisticated heuristics and engineering that modern implementations employ.
Whether you’re working on formal verification, automated reasoning, or constraint satisfaction problems, understanding CDCL gives you insight into one of computer science’s most important algorithmic breakthroughs. The next time your SAT solver solves a problem in milliseconds that would have taken hours with older algorithms, you can appreciate the conflict analysis and clause learning happening behind the scenes.
Best Practices and Common Pitfalls
Best Practices
# Good: Proper solver initialization and configuration
class WellConfiguredSolver:
@staticmethod
def create_solver(clauses: list[Clause],
problem_type: str = 'general') -> CDCLSolver:
"""Create solver with appropriate configuration"""
num_vars = max(
lit.variable for clause in clauses for lit in clause.literals
)
solver = CDCLSolver(clauses, num_vars)
# Configure based on problem type
if problem_type == 'industrial':
# Industrial problems benefit from aggressive restarts
solver.restart_strategy = GeometricRestart(initial=100, factor=1.5)
solver.vsids.decay_factor = 0.95
elif problem_type == 'random':
# Random problems need different settings
solver.restart_strategy = RestartStrategy() # Luby
solver.vsids.decay_factor = 0.90
elif problem_type == 'crafted':
# Crafted problems may need less aggressive restarts
solver.restart_strategy = GeometricRestart(initial=500, factor=1.2)
solver.vsids.decay_factor = 0.99
return solver
# Good: Proper error handling and validation
class RobustSolver:
def solve_with_validation(self, clauses: list[Clause]) -> tuple[bool, dict]:
"""Solve with input validation and error handling"""
# Validate input
if not clauses:
return True, {} # Empty formula is SAT
# Check for empty clauses (immediate UNSAT)
for clause in clauses:
if not clause.literals:
return False, {}
# Check for contradictory unit clauses
unit_clauses = [c for c in clauses if len(c.literals) == 1]
unit_vars = {}
for clause in unit_clauses:
lit = clause.literals[0]
if lit.variable in unit_vars:
if unit_vars[lit.variable] != (not lit.negated):
return False, {} # Contradiction
unit_vars[lit.variable] = not lit.negated
# Solve
try:
solver = CDCLSolver(clauses, self._count_vars(clauses))
satisfiable, assignment = solver.solve()
# Verify solution
if satisfiable:
if not self._verify_solution(clauses, assignment):
raise RuntimeError("Invalid solution produced")
return satisfiable, assignment
except Exception as e:
print(f"Error during solving: {e}")
raise
def _verify_solution(self, clauses: list[Clause],
assignment: dict) -> bool:
"""Verify that assignment satisfies all clauses"""
for clause in clauses:
satisfied = False
for lit in clause.literals:
if lit.variable in assignment:
value = assignment[lit.variable]
if (value and not lit.negated) or (not value and lit.negated):
satisfied = True
break
if not satisfied:
return False
return True
def _count_vars(self, clauses: list[Clause]) -> int:
variables = set()
for clause in clauses:
for lit in clause.literals:
variables.add(lit.variable)
return max(variables) if variables else 0
# Bad: No validation or error handling
class UnsafeSolver:
def solve(self, clauses):
# No input validation
solver = CDCLSolver(clauses, 1000) # Hardcoded variable count
result, assignment = solver.solve()
# No solution verification
return result
Common Pitfalls
# Pitfall 1: Not handling empty clauses
# Bad:
def bad_clause_handling(clauses):
solver = CDCLSolver(clauses, 100)
return solver.solve()
# If clauses contain empty clause, should return UNSAT immediately
# Good:
def good_clause_handling(clauses):
# Check for empty clauses first
if any(len(c.literals) == 0 for c in clauses):
return False, {} # UNSAT
solver = CDCLSolver(clauses, 100)
return solver.solve()
# Pitfall 2: Incorrect variable numbering
# Bad:
def bad_variable_numbering():
# Variables should start from 1, not 0
clause = Clause([Literal(0, False)]) # Variable 0 is problematic
return clause
# Good:
def good_variable_numbering():
# Variables start from 1
clause = Clause([Literal(1, False)])
return clause
# Pitfall 3: Not cleaning up learned clauses
# Bad:
class BadClauseManagement:
def solve(self, clauses):
solver = CDCLSolver(clauses, 100)
# Never delete learned clauses - memory explosion
solver.clause_db.reduce_database = lambda: None
return solver.solve()
# Good:
class GoodClauseManagement:
def solve(self, clauses):
solver = CDCLSolver(clauses, 100)
# Periodic clause deletion is enabled by default
return solver.solve()
# Pitfall 4: Ignoring polarity in learned clauses
# Bad:
def bad_learned_clause(conflict):
# Just using variables without considering polarity
vars_in_conflict = [lit.variable for lit in conflict.literals]
return Clause([Literal(v, False) for v in vars_in_conflict])
# This might not prevent the conflict!
# Good:
def good_learned_clause(conflict, graph):
# Proper conflict analysis considering polarity
analyzer = ConflictAnalyzer(graph)
learned, level = analyzer.analyze_conflict(conflict, current_level)
return learned
# Pitfall 5: Not updating VSIDS scores
# Bad:
class BadVSIDS:
def handle_conflict(self, learned_clause):
self.clause_db.add_learned_clause(learned_clause)
# Forgot to bump activities!
# VSIDS won't learn which variables are important
# Good:
class GoodVSIDS:
def handle_conflict(self, learned_clause):
self.clause_db.add_learned_clause(learned_clause)
# Update activities
for lit in learned_clause.literals:
self.vsids.bump_activity(lit.variable)
self.vsids.decay_activities()
Debugging CDCL Solvers
# Good: Comprehensive logging and debugging
class DebuggableCDCLSolver(CDCLSolver):
def __init__(self, clauses: list[Clause], num_variables: int,
debug: bool = False):
super().__init__(clauses, num_variables)
self.debug = debug
self.stats = {
'decisions': 0,
'conflicts': 0,
'propagations': 0,
'restarts': 0,
'learned_clauses': 0
}
def solve(self) -> tuple[bool, dict]:
"""Solve with optional debugging output"""
if self.debug:
print(f"Starting CDCL solver")
print(f"Variables: {self.num_variables}")
print(f"Clauses: {len(self.original_clauses)}")
result = super().solve()
if self.debug:
self._print_statistics()
return result
def _print_statistics(self):
"""Print solving statistics"""
print("\n=== Solver Statistics ===")
print(f"Decisions: {self.stats['decisions']}")
print(f"Conflicts: {self.stats['conflicts']}")
print(f"Propagations: {self.stats['propagations']}")
print(f"Restarts: {self.stats['restarts']}")
print(f"Learned clauses: {self.stats['learned_clauses']}")
print(f"Final clause DB size: {len(self.clause_db.get_all_clauses())}")
def make_decision(self, assignment: dict):
"""Override to track decisions"""
self.stats['decisions'] += 1
if self.debug:
var, val, level = self.decision_engine.make_decision(assignment)
print(f"Decision @{level}: x{var} = {val}")
return var, val, level
return super().make_decision(assignment)
def analyze_conflict(self, conflict_clause: Clause, level: int):
"""Override to track conflicts"""
self.stats['conflicts'] += 1
if self.debug:
print(f"Conflict @{level}: {conflict_clause}")
learned, backjump = super().analyze_conflict(conflict_clause, level)
if self.debug:
print(f"Learned: {learned}, backjump to {backjump}")
self.stats['learned_clauses'] += 1
return learned, backjump
Advanced Topics
Proof Generation
Modern CDCL solvers can generate proofs of unsatisfiability:
# Good: CDCL with proof generation
class ProofGeneratingSolver(CDCLSolver):
def __init__(self, clauses: list[Clause], num_variables: int):
super().__init__(clauses, num_variables)
self.proof = [] # Resolution proof
def analyze_conflict(self, conflict_clause: Clause, level: int):
"""Analyze conflict and record proof steps"""
learned, backjump = super().analyze_conflict(conflict_clause, level)
# Record resolution steps
proof_step = {
'type': 'resolution',
'conflict': conflict_clause,
'learned': learned,
'level': level
}
self.proof.append(proof_step)
return learned, backjump
def get_proof(self) -> list:
"""Get resolution proof of unsatisfiability"""
return self.proof
def verify_proof(self) -> bool:
"""Verify that proof is valid"""
# Check each resolution step
for step in self.proof:
if not self._verify_resolution_step(step):
return False
return True
def _verify_resolution_step(self, step: dict) -> bool:
"""Verify a single resolution step"""
# Verify that learned clause follows from conflict and reasons
return True # Simplified
Symmetry Breaking
Exploiting problem symmetries:
# Good: Symmetry-aware solving
class SymmetryBreakingSolver:
def __init__(self, clauses: list[Clause], symmetries: list):
self.clauses = clauses
self.symmetries = symmetries # List of variable permutations
def add_symmetry_breaking_clauses(self) -> list[Clause]:
"""Add clauses that break symmetries"""
breaking_clauses = []
for symmetry in self.symmetries:
# Add lexicographic ordering constraint
# If σ is a symmetry, add: x₁ ≤ σ(x₁), x₂ ≤ σ(x₂), ...
for var in range(1, len(symmetry) + 1):
sym_var = symmetry[var - 1]
if var != sym_var:
# x_var → x_sym_var (i.e., ¬x_var ∨ x_sym_var)
breaking_clauses.append(Clause([
Literal(var, True),
Literal(sym_var, False)
]))
return breaking_clauses
def solve(self) -> tuple[bool, dict]:
"""Solve with symmetry breaking"""
all_clauses = self.clauses + self.add_symmetry_breaking_clauses()
solver = CDCLSolver(all_clauses, self._count_vars(all_clauses))
return solver.solve()
Conclusion
CDCL represents a fundamental shift in how we approach SAT solving. By learning from conflicts rather than simply backtracking, it transforms an exponential search problem into something tractable for many real-world instances. The algorithm’s elegance lies in its simplicity—the core idea is straightforward—yet its power comes from the sophisticated heuristics and engineering that modern implementations employ.
The key innovations that make CDCL successful are:
- Conflict Analysis: Understanding why conflicts occur through implication graph analysis
- Clause Learning: Adding constraints that prevent repeating the same mistakes
- Non-Chronological Backjumping: Intelligently jumping back to relevant decision levels
- VSIDS Heuristic: Focusing on variables involved in recent conflicts
- Watched Literals: Efficient propagation through lazy data structures
- Restarts: Escaping local search space traps while preserving learned knowledge
- Clause Deletion: Managing memory by removing less useful learned clauses
Whether you’re working on formal verification, automated reasoning, constraint satisfaction, or any problem that can be encoded as SAT, understanding CDCL gives you insight into one of computer science’s most important algorithmic breakthroughs. Modern SAT solvers based on CDCL can handle formulas with millions of variables and clauses, solving problems that were completely intractable just decades ago.
The next time your SAT solver solves a problem in milliseconds that would have taken hours with older algorithms, you can appreciate the conflict analysis, clause learning, and intelligent backjumping happening behind the scenes. CDCL has truly revolutionized Boolean satisfiability solving and continues to be an active area of research and development.
External Resources
- MiniSat - A minimalistic, open-source SAT solver implementing CDCL
- Glucose - Modern CDCL solver with LBD-based clause management
- CaDiCaL - State-of-the-art CDCL solver
- SAT Competition - Annual competition showcasing latest SAT solving techniques
- Handbook of Satisfiability - Comprehensive reference on SAT solving
- DPLL and CDCL Tutorial - Interactive tutorial on SAT solving algorithms
- PySAT - Python library providing access to modern SAT solvers
Related Articles
- Boolean Satisfiability Problem (SAT): Fundamentals and Complexity
- DPLL Algorithm: The Foundation of Modern SAT Solving
- SMT Solving: Extending SAT to Richer Theories
- Formal Verification Using SAT Solvers
- Constraint Satisfaction Problems and SAT Encoding
Comments