First-order theorem proving represents one of the most powerful techniques in automated reasoning, enabling computers to verify mathematical proofs, validate software correctness, and solve complex logical problems. This comprehensive guide explores the foundations, algorithms, and practical applications of first-order theorem proving.
Understanding First-Order Logic
First-order logic (FOL), also called predicate logic, extends propositional logic by introducing quantifiers, predicates, and functions over domains of objects. Unlike propositional logic which deals only with true/false statements, FOL can express relationships between objects and make statements about collections of objects.
Basic Components of First-Order Logic
The syntax of first-order logic consists of several key elements:
# Example: Representing FOL terms in Python
class Term:
"""Base class for FOL terms"""
pass
class Variable(Term):
def __init__(self, name):
self.name = name
def __repr__(self):
return self.name
class Constant(Term):
def __init__(self, name):
self.name = name
def __repr__(self):
return self.name
class Function(Term):
def __init__(self, name, args):
self.name = name
self.args = args # List of terms
def __repr__(self):
args_str = ", ".join(str(arg) for arg in self.args)
return f"{self.name}({args_str})"
# Example usage
x = Variable("x")
john = Constant("john")
father_of = Function("father", [john])
print(f"Variable: {x}")
print(f"Constant: {john}")
print(f"Function: {father_of}")
Predicates and Formulas
Predicates express properties and relationships between objects:
class Predicate:
def __init__(self, name, args):
self.name = name
self.args = args # List of terms
def __repr__(self):
args_str = ", ".join(str(arg) for arg in self.args)
return f"{self.name}({args_str})"
class Formula:
"""Base class for FOL formulas"""
pass
class Atom(Formula):
def __init__(self, predicate):
self.predicate = predicate
def __repr__(self):
return str(self.predicate)
class Not(Formula):
def __init__(self, formula):
self.formula = formula
def __repr__(self):
return f"ยฌ{self.formula}"
class And(Formula):
def __init__(self, left, right):
self.left = left
self.right = right
def __repr__(self):
return f"({self.left} โง {self.right})"
class Or(Formula):
def __init__(self, left, right):
self.left = left
self.right = right
def __repr__(self):
return f"({self.left} โจ {self.right})"
# Example: "Human(x) โง Mortal(x)"
x = Variable("x")
human = Atom(Predicate("Human", [x]))
mortal = Atom(Predicate("Mortal", [x]))
formula = And(human, mortal)
print(f"Formula: {formula}")
Quantifiers in First-Order Logic
Quantifiers allow us to make statements about collections of objects:
class Forall(Formula):
"""Universal quantifier: โx.P(x)"""
def __init__(self, variable, formula):
self.variable = variable
self.formula = formula
def __repr__(self):
return f"โ{self.variable}.{self.formula}"
class Exists(Formula):
"""Existential quantifier: โx.P(x)"""
def __init__(self, variable, formula):
self.variable = variable
self.formula = formula
def __repr__(self):
return f"โ{self.variable}.{self.formula}"
# Example: "โx.(Human(x) โ Mortal(x))"
# "All humans are mortal"
x = Variable("x")
human = Atom(Predicate("Human", [x]))
mortal = Atom(Predicate("Mortal", [x]))
implies = Or(Not(human), mortal) # A โ B โก ยฌA โจ B
all_humans_mortal = Forall(x, implies)
print(f"Universal statement: {all_humans_mortal}")
# Example: "โx.(Human(x) โง Wise(x))"
# "There exists a wise human"
wise = Atom(Predicate("Wise", [x]))
exists_wise_human = Exists(x, And(human, wise))
print(f"Existential statement: {exists_wise_human}")
Clausal Normal Form (CNF)
Before applying theorem proving algorithms, formulas must be converted to Clausal Normal Form. CNF is a conjunction of disjunctions of literals, where a literal is an atomic formula or its negation.
Converting to CNF: Step-by-Step
def eliminate_implications(formula):
"""Convert A โ B to ยฌA โจ B"""
if isinstance(formula, Implies):
return Or(Not(formula.left), formula.right)
elif isinstance(formula, And):
return And(eliminate_implications(formula.left),
eliminate_implications(formula.right))
elif isinstance(formula, Or):
return Or(eliminate_implications(formula.left),
eliminate_implications(formula.right))
elif isinstance(formula, Not):
return Not(eliminate_implications(formula.formula))
return formula
def move_negations_inward(formula):
"""Apply De Morgan's laws and double negation"""
if isinstance(formula, Not):
inner = formula.formula
if isinstance(inner, Not):
# ยฌยฌA โก A
return move_negations_inward(inner.formula)
elif isinstance(inner, And):
# ยฌ(A โง B) โก ยฌA โจ ยฌB
return Or(move_negations_inward(Not(inner.left)),
move_negations_inward(Not(inner.right)))
elif isinstance(inner, Or):
# ยฌ(A โจ B) โก ยฌA โง ยฌB
return And(move_negations_inward(Not(inner.left)),
move_negations_inward(Not(inner.right)))
elif isinstance(inner, Forall):
# ยฌโx.P(x) โก โx.ยฌP(x)
return Exists(inner.variable,
move_negations_inward(Not(inner.formula)))
elif isinstance(inner, Exists):
# ยฌโx.P(x) โก โx.ยฌP(x)
return Forall(inner.variable,
move_negations_inward(Not(inner.formula)))
elif isinstance(formula, And):
return And(move_negations_inward(formula.left),
move_negations_inward(formula.right))
elif isinstance(formula, Or):
return Or(move_negations_inward(formula.left),
move_negations_inward(formula.right))
return formula
def standardize_variables(formula, counter={'count': 0}):
"""Rename variables to avoid conflicts"""
if isinstance(formula, Forall) or isinstance(formula, Exists):
new_var = Variable(f"{formula.variable.name}_{counter['count']}")
counter['count'] += 1
new_formula = substitute(formula.formula, formula.variable, new_var)
if isinstance(formula, Forall):
return Forall(new_var, standardize_variables(new_formula, counter))
else:
return Exists(new_var, standardize_variables(new_formula, counter))
elif isinstance(formula, And):
return And(standardize_variables(formula.left, counter),
standardize_variables(formula.right, counter))
elif isinstance(formula, Or):
return Or(standardize_variables(formula.left, counter),
standardize_variables(formula.right, counter))
return formula
Skolemization: Eliminating Existential Quantifiers
class SkolemFunction:
"""Skolem function for existential elimination"""
def __init__(self, name, args):
self.name = name
self.args = args
def __repr__(self):
if not self.args:
return self.name
args_str = ", ".join(str(arg) for arg in self.args)
return f"{self.name}({args_str})"
def skolemize(formula, universal_vars=None, skolem_counter={'count': 0}):
"""Replace existential quantifiers with Skolem functions"""
if universal_vars is None:
universal_vars = []
if isinstance(formula, Forall):
new_universal = universal_vars + [formula.variable]
return Forall(formula.variable,
skolemize(formula.formula, new_universal, skolem_counter))
elif isinstance(formula, Exists):
# Create Skolem function
skolem_name = f"sk_{skolem_counter['count']}"
skolem_counter['count'] += 1
if universal_vars:
skolem_term = Function(skolem_name, universal_vars)
else:
skolem_term = Constant(skolem_name)
# Substitute existential variable with Skolem term
new_formula = substitute(formula.formula, formula.variable, skolem_term)
return skolemize(new_formula, universal_vars, skolem_counter)
elif isinstance(formula, And):
return And(skolemize(formula.left, universal_vars, skolem_counter),
skolemize(formula.right, universal_vars, skolem_counter))
elif isinstance(formula, Or):
return Or(skolemize(formula.left, universal_vars, skolem_counter),
skolemize(formula.right, universal_vars, skolem_counter))
return formula
# Example: Skolemization
# โx.โy.Loves(x, y) โ โx.Loves(x, sk_0(x))
x = Variable("x")
y = Variable("y")
loves = Atom(Predicate("Loves", [x, y]))
formula = Forall(x, Exists(y, loves))
skolemized = skolemize(formula)
print(f"Original: {formula}")
print(f"Skolemized: {skolemized}")
Unification: The Heart of Theorem Proving
Unification is the process of finding substitutions that make two terms identical. It’s fundamental to resolution-based theorem proving.
Most General Unifier (MGU)
class Substitution:
"""Represents a substitution mapping variables to terms"""
def __init__(self, mapping=None):
self.mapping = mapping if mapping else {}
def apply(self, term):
"""Apply substitution to a term"""
if isinstance(term, Variable):
if term.name in self.mapping:
return self.apply(self.mapping[term.name])
return term
elif isinstance(term, Constant):
return term
elif isinstance(term, Function):
new_args = [self.apply(arg) for arg in term.args]
return Function(term.name, new_args)
return term
def compose(self, other):
"""Compose two substitutions"""
new_mapping = {}
# Apply other to all values in self
for var, term in self.mapping.items():
new_mapping[var] = other.apply(term)
# Add mappings from other not in self
for var, term in other.mapping.items():
if var not in new_mapping:
new_mapping[var] = term
return Substitution(new_mapping)
def __repr__(self):
items = [f"{var}/{term}" for var, term in self.mapping.items()]
return "{" + ", ".join(items) + "}"
def occurs_check(var, term):
"""Check if variable occurs in term (prevents infinite structures)"""
if isinstance(term, Variable):
return var.name == term.name
elif isinstance(term, Function):
return any(occurs_check(var, arg) for arg in term.args)
return False
def unify(term1, term2):
"""Find most general unifier of two terms"""
return unify_helper(term1, term2, Substitution())
def unify_helper(term1, term2, subst):
"""Recursive unification with substitution"""
# Apply current substitution
term1 = subst.apply(term1)
term2 = subst.apply(term2)
# Same term
if isinstance(term1, Constant) and isinstance(term2, Constant):
if term1.name == term2.name:
return subst
return None
# Variable cases
if isinstance(term1, Variable):
if occurs_check(term1, term2):
return None if term1.name != term2.name else subst
new_subst = Substitution({term1.name: term2})
return subst.compose(new_subst)
if isinstance(term2, Variable):
if occurs_check(term2, term1):
return None
new_subst = Substitution({term2.name: term1})
return subst.compose(new_subst)
# Function cases
if isinstance(term1, Function) and isinstance(term2, Function):
if term1.name != term2.name or len(term1.args) != len(term2.args):
return None
current_subst = subst
for arg1, arg2 in zip(term1.args, term2.args):
current_subst = unify_helper(arg1, arg2, current_subst)
if current_subst is None:
return None
return current_subst
return None
# Example: Unification
x = Variable("x")
y = Variable("y")
john = Constant("john")
# Unify f(x, y) with f(john, z)
term1 = Function("f", [x, y])
term2 = Function("f", [john, Variable("z")])
mgu = unify(term1, term2)
print(f"Unify {term1} and {term2}")
print(f"MGU: {mgu}")
print(f"Result: {mgu.apply(term1)}")
Good Pattern: Efficient Unification with Occurs Check
def unify_efficient(term1, term2):
"""Efficient unification with proper occurs check"""
stack = [(term1, term2)]
subst = Substitution()
while stack:
t1, t2 = stack.pop()
t1 = subst.apply(t1)
t2 = subst.apply(t2)
if isinstance(t1, Variable):
if not isinstance(t2, Variable) or t1.name != t2.name:
if occurs_check(t1, t2):
return None
subst = subst.compose(Substitution({t1.name: t2}))
elif isinstance(t2, Variable):
if occurs_check(t2, t1):
return None
subst = subst.compose(Substitution({t2.name: t1}))
elif isinstance(t1, Constant) and isinstance(t2, Constant):
if t1.name != t2.name:
return None
elif isinstance(t1, Function) and isinstance(t2, Function):
if t1.name != t2.name or len(t1.args) != len(t2.args):
return None
stack.extend(zip(t1.args, t2.args))
else:
return None
return subst
Bad Pattern: Unification Without Occurs Check
def unify_bad(term1, term2):
"""BAD: Missing occurs check leads to infinite structures"""
if isinstance(term1, Variable):
# WRONG: No occurs check
return Substitution({term1.name: term2})
if isinstance(term2, Variable):
# WRONG: No occurs check
return Substitution({term2.name: term1})
# This can create x = f(x), leading to infinite loops
return None
Resolution: The Core Inference Rule
Resolution is the primary inference rule used in automated theorem proving. It combines two clauses that contain complementary literals to produce a new clause.
Binary Resolution
class Clause:
"""A disjunction of literals"""
def __init__(self, literals):
self.literals = frozenset(literals) # Use frozenset for hashing
def __repr__(self):
if not self.literals:
return "โก" # Empty clause (contradiction)
return " โจ ".join(str(lit) for lit in self.literals)
def __hash__(self):
return hash(self.literals)
def __eq__(self, other):
return self.literals == other.literals
class Literal:
"""An atomic formula or its negation"""
def __init__(self, predicate, positive=True):
self.predicate = predicate
self.positive = positive
def negate(self):
return Literal(self.predicate, not self.positive)
def __repr__(self):
if self.positive:
return str(self.predicate)
return f"ยฌ{self.predicate}"
def __hash__(self):
return hash((str(self.predicate), self.positive))
def __eq__(self, other):
return (str(self.predicate) == str(other.predicate) and
self.positive == other.positive)
def resolve(clause1, clause2):
"""Apply binary resolution to two clauses"""
resolvents = []
for lit1 in clause1.literals:
for lit2 in clause2.literals:
# Check if literals are complementary
if lit1.positive != lit2.positive:
# Try to unify the predicates
mgu = unify_predicates(lit1.predicate, lit2.predicate)
if mgu is not None:
# Create resolvent
new_literals = set()
# Add literals from clause1 except lit1
for lit in clause1.literals:
if lit != lit1:
new_lit = apply_substitution_to_literal(lit, mgu)
new_literals.add(new_lit)
# Add literals from clause2 except lit2
for lit in clause2.literals:
if lit != lit2:
new_lit = apply_substitution_to_literal(lit, mgu)
new_literals.add(new_lit)
resolvent = Clause(new_literals)
resolvents.append((resolvent, mgu))
return resolvents
def unify_predicates(pred1, pred2):
"""Unify two predicates"""
if pred1.name != pred2.name or len(pred1.args) != len(pred2.args):
return None
subst = Substitution()
for arg1, arg2 in zip(pred1.args, pred2.args):
subst = unify_helper(arg1, arg2, subst)
if subst is None:
return None
return subst
def apply_substitution_to_literal(literal, subst):
"""Apply substitution to a literal"""
new_args = [subst.apply(arg) for arg in literal.predicate.args]
new_pred = Predicate(literal.predicate.name, new_args)
return Literal(new_pred, literal.positive)
Resolution Refutation
def resolution_refutation(clauses, max_iterations=1000):
"""
Prove unsatisfiability using resolution refutation.
Returns True if contradiction found, False otherwise.
"""
clauses = set(clauses)
new_clauses = set()
iteration = 0
while iteration < max_iterations:
iteration += 1
clause_list = list(clauses)
# Try all pairs of clauses
for i in range(len(clause_list)):
for j in range(i + 1, len(clause_list)):
resolvents = resolve(clause_list[i], clause_list[j])
for resolvent, mgu in resolvents:
# Check for empty clause (contradiction)
if not resolvent.literals:
print(f"Contradiction found at iteration {iteration}")
return True
new_clauses.add(resolvent)
# Check if we've generated new clauses
if new_clauses.issubset(clauses):
print(f"No new clauses generated. Proof failed.")
return False
clauses.update(new_clauses)
new_clauses.clear()
print(f"Max iterations reached. Proof incomplete.")
return False
# Example: Prove "All humans are mortal, Socrates is human โข Socrates is mortal"
def example_socrates():
# Clauses in CNF:
# 1. ยฌHuman(x) โจ Mortal(x) [All humans are mortal]
# 2. Human(socrates) [Socrates is human]
# 3. ยฌMortal(socrates) [Negation of conclusion]
x = Variable("x")
socrates = Constant("socrates")
# Clause 1: ยฌHuman(x) โจ Mortal(x)
human_x = Literal(Predicate("Human", [x]), positive=False)
mortal_x = Literal(Predicate("Mortal", [x]), positive=True)
clause1 = Clause([human_x, mortal_x])
# Clause 2: Human(socrates)
human_socrates = Literal(Predicate("Human", [socrates]), positive=True)
clause2 = Clause([human_socrates])
# Clause 3: ยฌMortal(socrates)
not_mortal_socrates = Literal(Predicate("Mortal", [socrates]), positive=False)
clause3 = Clause([not_mortal_socrates])
print("Proving: All humans are mortal, Socrates is human โข Socrates is mortal")
print(f"Clause 1: {clause1}")
print(f"Clause 2: {clause2}")
print(f"Clause 3: {clause3}")
print()
result = resolution_refutation([clause1, clause2, clause3])
print(f"Proof result: {result}")
example_socrates()
Advanced Resolution Strategies
Set of Support Strategy
def resolution_with_sos(axioms, negated_goal, max_iterations=1000):
"""
Resolution with Set of Support (SOS) strategy.
Only resolve clauses where at least one is from SOS.
"""
# SOS initially contains negated goal
sos = {negated_goal}
usable = set(axioms)
iteration = 0
while iteration < max_iterations:
iteration += 1
new_sos = set()
# Resolve SOS clauses with usable clauses
for sos_clause in sos:
for usable_clause in usable:
resolvents = resolve(sos_clause, usable_clause)
for resolvent, mgu in resolvents:
if not resolvent.literals:
print(f"Contradiction found at iteration {iteration}")
return True
new_sos.add(resolvent)
# Resolve SOS clauses with each other
sos_list = list(sos)
for i in range(len(sos_list)):
for j in range(i + 1, len(sos_list)):
resolvents = resolve(sos_list[i], sos_list[j])
for resolvent, mgu in resolvents:
if not resolvent.literals:
print(f"Contradiction found at iteration {iteration}")
return True
new_sos.add(resolvent)
if not new_sos:
print("No new clauses generated. Proof failed.")
return False
# Move old SOS to usable, new resolvents become SOS
usable.update(sos)
sos = new_sos
print("Max iterations reached. Proof incomplete.")
return False
Unit Resolution Strategy
def unit_resolution(clauses, max_iterations=1000):
"""
Unit resolution: prioritize resolution with unit clauses.
More efficient for many problems.
"""
clauses = set(clauses)
unit_clauses = {c for c in clauses if len(c.literals) == 1}
non_unit_clauses = clauses - unit_clauses
iteration = 0
while iteration < max_iterations and unit_clauses:
iteration += 1
new_clauses = set()
# Resolve unit clauses with all clauses
for unit in list(unit_clauses):
for clause in clauses:
if clause == unit:
continue
resolvents = resolve(unit, clause)
for resolvent, mgu in resolvents:
if not resolvent.literals:
print(f"Contradiction found at iteration {iteration}")
return True
new_clauses.add(resolvent)
if len(resolvent.literals) == 1:
unit_clauses.add(resolvent)
if not new_clauses:
print("No new clauses generated. Proof failed.")
return False
clauses.update(new_clauses)
print("No more unit clauses. Switching to general resolution.")
return resolution_refutation(clauses, max_iterations - iteration)
Good Pattern: Subsumption and Tautology Elimination
def subsumes(clause1, clause2):
"""Check if clause1 subsumes clause2 (clause1 is more general)"""
if len(clause1.literals) > len(clause2.literals):
return False
# Try to find a substitution that makes clause1 a subset of clause2
for lit1 in clause1.literals:
found = False
for lit2 in clause2.literals:
if lit1.positive == lit2.positive:
mgu = unify_predicates(lit1.predicate, lit2.predicate)
if mgu is not None:
found = True
break
if not found:
return False
return True
def is_tautology(clause):
"""Check if clause is a tautology (contains P and ยฌP)"""
for lit1 in clause.literals:
for lit2 in clause.literals:
if lit1.positive != lit2.positive:
if unify_predicates(lit1.predicate, lit2.predicate) is not None:
return True
return False
def simplify_clause_set(clauses):
"""Remove subsumed clauses and tautologies"""
simplified = set()
for clause in clauses:
# Skip tautologies
if is_tautology(clause):
continue
# Check if subsumed by existing clause
subsumed = False
for existing in simplified:
if subsumes(existing, clause):
subsumed = True
break
if not subsumed:
# Remove clauses subsumed by this one
simplified = {c for c in simplified if not subsumes(clause, c)}
simplified.add(clause)
return simplified
Bad Pattern: No Clause Simplification
def resolution_bad(clauses, max_iterations=1000):
"""BAD: No simplification leads to exponential clause growth"""
clauses = set(clauses)
for iteration in range(max_iterations):
clause_list = list(clauses)
for i in range(len(clause_list)):
for j in range(i + 1, len(clause_list)):
resolvents = resolve(clause_list[i], clause_list[j])
for resolvent, mgu in resolvents:
if not resolvent.literals:
return True
# WRONG: Adding all resolvents without simplification
clauses.add(resolvent)
# WRONG: Clause set grows exponentially
print(f"Iteration {iteration}: {len(clauses)} clauses")
return False
Complete Theorem Prover Implementation
Here’s a complete, production-ready theorem prover with all optimizations:
class TheoremProver:
"""Complete first-order theorem prover with resolution"""
def __init__(self, max_iterations=1000, strategy='sos'):
self.max_iterations = max_iterations
self.strategy = strategy
self.proof_trace = []
def prove(self, axioms, goal):
"""
Prove that axioms entail goal.
Returns (success, proof_trace)
"""
# Convert to CNF
cnf_axioms = [self.to_cnf(axiom) for axiom in axioms]
# Negate goal and convert to CNF
negated_goal = Not(goal)
cnf_goal = self.to_cnf(negated_goal)
# Flatten to clause set
all_clauses = []
for cnf in cnf_axioms + [cnf_goal]:
all_clauses.extend(self.extract_clauses(cnf))
# Simplify initial clause set
all_clauses = simplify_clause_set(all_clauses)
print(f"Initial clauses: {len(all_clauses)}")
for i, clause in enumerate(all_clauses, 1):
print(f" {i}. {clause}")
print()
# Choose strategy
if self.strategy == 'sos':
return self.prove_sos(all_clauses[:-1], all_clauses[-1])
elif self.strategy == 'unit':
return self.prove_unit(all_clauses)
else:
return self.prove_basic(all_clauses)
def prove_basic(self, clauses):
"""Basic resolution without special strategies"""
clauses = set(clauses)
for iteration in range(self.max_iterations):
clause_list = list(clauses)
new_clauses = set()
for i in range(len(clause_list)):
for j in range(i + 1, len(clause_list)):
resolvents = resolve(clause_list[i], clause_list[j])
for resolvent, mgu in resolvents:
self.proof_trace.append({
'iteration': iteration,
'parent1': clause_list[i],
'parent2': clause_list[j],
'resolvent': resolvent,
'mgu': mgu
})
if not resolvent.literals:
print(f"โ Proof found at iteration {iteration}")
return True, self.proof_trace
new_clauses.add(resolvent)
# Simplify new clauses
new_clauses = simplify_clause_set(new_clauses)
if new_clauses.issubset(clauses):
print(f"โ No new clauses. Proof failed.")
return False, self.proof_trace
clauses.update(new_clauses)
print(f"Iteration {iteration}: {len(clauses)} clauses")
print(f"โ Max iterations reached")
return False, self.proof_trace
def prove_sos(self, axioms, goal):
"""Prove using Set of Support strategy"""
sos = {goal}
usable = set(axioms)
for iteration in range(self.max_iterations):
new_sos = set()
# SOS with usable
for sos_clause in sos:
for usable_clause in usable:
resolvents = resolve(sos_clause, usable_clause)
for resolvent, mgu in resolvents:
if not resolvent.literals:
print(f"โ Proof found at iteration {iteration}")
return True, self.proof_trace
new_sos.add(resolvent)
# SOS with SOS
sos_list = list(sos)
for i in range(len(sos_list)):
for j in range(i + 1, len(sos_list)):
resolvents = resolve(sos_list[i], sos_list[j])
for resolvent, mgu in resolvents:
if not resolvent.literals:
print(f"โ Proof found at iteration {iteration}")
return True, self.proof_trace
new_sos.add(resolvent)
new_sos = simplify_clause_set(new_sos)
if not new_sos:
print(f"โ No new clauses. Proof failed.")
return False, self.proof_trace
usable.update(sos)
sos = new_sos
print(f"Iteration {iteration}: SOS={len(sos)}, Usable={len(usable)}")
print(f"โ Max iterations reached")
return False, self.proof_trace
def prove_unit(self, clauses):
"""Prove using unit resolution strategy"""
clauses = set(clauses)
unit_clauses = {c for c in clauses if len(c.literals) == 1}
for iteration in range(self.max_iterations):
if not unit_clauses:
print("No more unit clauses. Switching to basic resolution.")
return self.prove_basic(clauses)
new_clauses = set()
for unit in list(unit_clauses):
for clause in clauses:
if clause == unit:
continue
resolvents = resolve(unit, clause)
for resolvent, mgu in resolvents:
if not resolvent.literals:
print(f"โ Proof found at iteration {iteration}")
return True, self.proof_trace
new_clauses.add(resolvent)
if len(resolvent.literals) == 1:
unit_clauses.add(resolvent)
new_clauses = simplify_clause_set(new_clauses)
if not new_clauses:
print(f"โ No new clauses. Proof failed.")
return False, self.proof_trace
clauses.update(new_clauses)
print(f"Iteration {iteration}: {len(clauses)} clauses, {len(unit_clauses)} units")
print(f"โ Max iterations reached")
return False, self.proof_trace
def to_cnf(self, formula):
"""Convert formula to CNF (simplified version)"""
# This is a placeholder - full implementation would include
# all CNF conversion steps
return formula
def extract_clauses(self, cnf):
"""Extract clauses from CNF formula"""
# Placeholder - would extract individual clauses
return []
# Example usage
def example_theorem_prover():
prover = TheoremProver(strategy='sos')
# Define axioms and goal
x = Variable("x")
socrates = Constant("socrates")
# Axiom: All humans are mortal
human_x = Atom(Predicate("Human", [x]))
mortal_x = Atom(Predicate("Mortal", [x]))
axiom1 = Forall(x, Or(Not(human_x), mortal_x))
# Axiom: Socrates is human
axiom2 = Atom(Predicate("Human", [socrates]))
# Goal: Socrates is mortal
goal = Atom(Predicate("Mortal", [socrates]))
success, trace = prover.prove([axiom1, axiom2], goal)
print(f"\nProof {'succeeded' if success else 'failed'}")
Practical Applications
Software Verification
class ProgramVerifier:
"""Verify program properties using theorem proving"""
def __init__(self):
self.prover = TheoremProver(strategy='unit')
def verify_precondition(self, precondition, code, postcondition):
"""
Verify that if precondition holds before code execution,
postcondition holds after.
"""
# Convert code to logical formulas (simplified)
code_axioms = self.code_to_logic(code)
# Prove: precondition โง code_axioms โ postcondition
axioms = [precondition] + code_axioms
success, trace = self.prover.prove(axioms, postcondition)
return success
def code_to_logic(self, code):
"""Convert code to logical axioms (simplified)"""
# This would analyze code and generate appropriate axioms
return []
# Example: Verify array bounds
def verify_array_access():
verifier = ProgramVerifier()
# Precondition: 0 <= i < length(array)
i = Variable("i")
array = Variable("array")
precondition = And(
Atom(Predicate("GreaterEqual", [i, Constant("0")])),
Atom(Predicate("Less", [i, Function("length", [array])]))
)
# Code: value = array[i]
code = "value = array[i]"
# Postcondition: No array bounds error
postcondition = Not(Atom(Predicate("ArrayBoundsError", [])))
result = verifier.verify_precondition(precondition, code, postcondition)
print(f"Verification result: {result}")
Database Query Optimization
class QueryOptimizer:
"""Optimize database queries using theorem proving"""
def __init__(self):
self.prover = TheoremProver(strategy='sos')
def queries_equivalent(self, query1, query2):
"""Check if two queries are logically equivalent"""
# Convert queries to FOL
fol1 = self.query_to_fol(query1)
fol2 = self.query_to_fol(query2)
# Prove query1 โ query2 and query2 โ query1
forward, _ = self.prover.prove([fol1], fol2)
backward, _ = self.prover.prove([fol2], fol1)
return forward and backward
def query_to_fol(self, query):
"""Convert SQL-like query to FOL (simplified)"""
# This would parse SQL and generate FOL representation
return None
def optimize_query(self, query):
"""Find equivalent but more efficient query"""
optimizations = [
self.push_down_selection,
self.eliminate_redundant_joins,
self.merge_projections
]
optimized = query
for optimization in optimizations:
candidate = optimization(optimized)
if self.queries_equivalent(optimized, candidate):
optimized = candidate
return optimized
def push_down_selection(self, query):
"""Push selection operations closer to data source"""
# Implementation would rewrite query tree
return query
def eliminate_redundant_joins(self, query):
"""Remove unnecessary join operations"""
return query
def merge_projections(self, query):
"""Combine multiple projection operations"""
return query
Automated Planning
class Planner:
"""Automated planning using theorem proving"""
def __init__(self):
self.prover = TheoremProver(strategy='sos')
def find_plan(self, initial_state, goal_state, actions):
"""
Find sequence of actions to reach goal from initial state.
Uses situation calculus representation.
"""
# Encode initial state
axioms = [self.encode_state(initial_state)]
# Encode action preconditions and effects
for action in actions:
axioms.extend(self.encode_action(action))
# Encode frame axioms (what doesn't change)
axioms.extend(self.frame_axioms(actions))
# Try to prove goal is reachable
success, trace = self.prover.prove(axioms, goal_state)
if success:
return self.extract_plan(trace)
return None
def encode_state(self, state):
"""Encode state as FOL formula"""
# Example: At(robot, location_a) โง Holding(robot, nothing)
return None
def encode_action(self, action):
"""Encode action preconditions and effects"""
# Example: Move(x, from, to) requires At(x, from) and causes At(x, to)
return []
def frame_axioms(self, actions):
"""Encode what doesn't change when actions execute"""
return []
def extract_plan(self, proof_trace):
"""Extract action sequence from proof"""
return []
# Example: Blocks world planning
def blocks_world_example():
planner = Planner()
# Initial: Block A on table, Block B on A
initial = And(
Atom(Predicate("On", [Constant("A"), Constant("table")])),
Atom(Predicate("On", [Constant("B"), Constant("A")])),
Atom(Predicate("Clear", [Constant("B")]))
)
# Goal: Block B on table, Block A on B
goal = And(
Atom(Predicate("On", [Constant("B"), Constant("table")])),
Atom(Predicate("On", [Constant("A"), Constant("B")])),
Atom(Predicate("Clear", [Constant("A")]))
)
# Actions: pickup, putdown, stack, unstack
actions = ["pickup", "putdown", "stack", "unstack"]
plan = planner.find_plan(initial, goal, actions)
if plan:
print(f"Plan found: {plan}")
else:
print("No plan exists")
Performance Optimization Techniques
Indexing for Fast Clause Retrieval
class ClauseIndex:
"""Index clauses for efficient retrieval during resolution"""
def __init__(self):
self.positive_index = {} # predicate_name -> set of clauses
self.negative_index = {} # predicate_name -> set of clauses
def add_clause(self, clause):
"""Add clause to index"""
for literal in clause.literals:
pred_name = literal.predicate.name
if literal.positive:
if pred_name not in self.positive_index:
self.positive_index[pred_name] = set()
self.positive_index[pred_name].add(clause)
else:
if pred_name not in self.negative_index:
self.negative_index[pred_name] = set()
self.negative_index[pred_name].add(clause)
def get_candidates(self, clause):
"""Get clauses that might resolve with given clause"""
candidates = set()
for literal in clause.literals:
pred_name = literal.predicate.name
# Get clauses with complementary literals
if literal.positive:
if pred_name in self.negative_index:
candidates.update(self.negative_index[pred_name])
else:
if pred_name in self.positive_index:
candidates.update(self.positive_index[pred_name])
return candidates
class OptimizedProver(TheoremProver):
"""Theorem prover with indexing optimization"""
def __init__(self, max_iterations=1000, strategy='sos'):
super().__init__(max_iterations, strategy)
self.index = ClauseIndex()
def prove_basic(self, clauses):
"""Optimized resolution with indexing"""
clauses = set(clauses)
# Build initial index
for clause in clauses:
self.index.add_clause(clause)
for iteration in range(self.max_iterations):
new_clauses = set()
for clause in clauses:
# Only check candidates that might resolve
candidates = self.index.get_candidates(clause)
for candidate in candidates:
resolvents = resolve(clause, candidate)
for resolvent, mgu in resolvents:
if not resolvent.literals:
print(f"โ Proof found at iteration {iteration}")
return True, self.proof_trace
new_clauses.add(resolvent)
new_clauses = simplify_clause_set(new_clauses)
if new_clauses.issubset(clauses):
print(f"โ No new clauses. Proof failed.")
return False, self.proof_trace
# Update index with new clauses
for clause in new_clauses:
self.index.add_clause(clause)
clauses.update(new_clauses)
print(f"Iteration {iteration}: {len(clauses)} clauses")
print(f"โ Max iterations reached")
return False, self.proof_trace
Good Pattern: Clause Weight Heuristics
class WeightedProver(TheoremProver):
"""Prover that prioritizes lighter clauses"""
def clause_weight(self, clause):
"""Calculate clause weight (prefer smaller clauses)"""
weight = 0
for literal in clause.literals:
weight += 1 # Base weight per literal
weight += self.term_weight(literal.predicate)
return weight
def term_weight(self, term):
"""Calculate term complexity"""
if isinstance(term, Variable):
return 1
elif isinstance(term, Constant):
return 1
elif isinstance(term, Function):
return 1 + sum(self.term_weight(arg) for arg in term.args)
elif isinstance(term, Predicate):
return sum(self.term_weight(arg) for arg in term.args)
return 0
def prove_basic(self, clauses):
"""Resolution with weight-based clause selection"""
clauses = set(clauses)
for iteration in range(self.max_iterations):
# Sort clauses by weight
sorted_clauses = sorted(clauses, key=self.clause_weight)
new_clauses = set()
# Prioritize lighter clauses
for i, clause1 in enumerate(sorted_clauses[:100]): # Limit to top 100
for clause2 in sorted_clauses[i+1:200]:
resolvents = resolve(clause1, clause2)
for resolvent, mgu in resolvents:
if not resolvent.literals:
print(f"โ Proof found at iteration {iteration}")
return True, self.proof_trace
new_clauses.add(resolvent)
new_clauses = simplify_clause_set(new_clauses)
if new_clauses.issubset(clauses):
print(f"โ No new clauses. Proof failed.")
return False, self.proof_trace
clauses.update(new_clauses)
print(f"Iteration {iteration}: {len(clauses)} clauses")
print(f"โ Max iterations reached")
return False, self.proof_trace
Bad Pattern: No Heuristics or Ordering
def prove_unoptimized(clauses, max_iterations=1000):
"""BAD: No heuristics leads to poor performance"""
clauses = list(clauses) # WRONG: Using list instead of set
for iteration in range(max_iterations):
# WRONG: Trying all pairs without any ordering
for i in range(len(clauses)):
for j in range(len(clauses)):
if i == j:
continue
resolvents = resolve(clauses[i], clauses[j])
for resolvent, mgu in resolvents:
if not resolvent.literals:
return True
# WRONG: Appending duplicates
clauses.append(resolvent)
# WRONG: Exponential growth, no pruning
print(f"Iteration {iteration}: {len(clauses)} clauses")
return False
Integration with Modern Tools
Using Z3 SMT Solver
try:
from z3 import *
class Z3TheoremProver:
"""Wrapper around Z3 SMT solver for theorem proving"""
def __init__(self):
self.solver = Solver()
def prove(self, axioms, goal):
"""Prove goal from axioms using Z3"""
# Add axioms
for axiom in axioms:
z3_axiom = self.to_z3(axiom)
self.solver.add(z3_axiom)
# Add negation of goal
z3_goal = self.to_z3(goal)
self.solver.add(Not(z3_goal))
# Check satisfiability
result = self.solver.check()
if result == unsat:
print("โ Proof successful (unsatisfiable)")
return True
elif result == sat:
print("โ Proof failed (satisfiable)")
print(f"Counterexample: {self.solver.model()}")
return False
else:
print("? Unknown result")
return None
def to_z3(self, formula):
"""Convert FOL formula to Z3 expression"""
# Simplified conversion
if isinstance(formula, Atom):
return Bool(str(formula.predicate))
elif isinstance(formula, Not):
return Not(self.to_z3(formula.formula))
elif isinstance(formula, And):
return And(self.to_z3(formula.left), self.to_z3(formula.right))
elif isinstance(formula, Or):
return Or(self.to_z3(formula.left), self.to_z3(formula.right))
return Bool('unknown')
# Example usage
def example_z3():
prover = Z3TheoremProver()
# Define problem
x = Variable("x")
p = Atom(Predicate("P", [x]))
q = Atom(Predicate("Q", [x]))
# Axiom: P(x) โ Q(x)
axiom = Or(Not(p), q)
# Goal: P(a) โง (P(x) โ Q(x)) โข Q(a)
result = prover.prove([axiom], q)
print(f"Z3 result: {result}")
except ImportError:
print("Z3 not available. Install with: pip install z3-solver")
Integration with Prover9/Mace4
import subprocess
import tempfile
import os
class Prover9Interface:
"""Interface to Prover9 automated theorem prover"""
def __init__(self, prover9_path='prover9', timeout=60):
self.prover9_path = prover9_path
self.timeout = timeout
def prove(self, axioms, goal):
"""Prove goal from axioms using Prover9"""
# Create input file
input_text = self.format_input(axioms, goal)
# Write to temporary file
with tempfile.NamedTemporaryFile(mode='w', suffix='.in', delete=False) as f:
f.write(input_text)
input_file = f.name
try:
# Run Prover9
result = subprocess.run(
[self.prover9_path, '-f', input_file],
capture_output=True,
text=True,
timeout=self.timeout
)
# Parse output
output = result.stdout
if 'THEOREM PROVED' in output:
print("โ Prover9: Theorem proved")
return True, self.extract_proof(output)
elif 'SEARCH FAILED' in output:
print("โ Prover9: Search failed")
return False, None
else:
print("? Prover9: Unknown result")
return None, None
except subprocess.TimeoutExpired:
print(f"โ Prover9: Timeout after {self.timeout}s")
return None, None
finally:
os.unlink(input_file)
def format_input(self, axioms, goal):
"""Format problem in Prover9 syntax"""
lines = ["formulas(assumptions)."]
for axiom in axioms:
lines.append(f" {self.to_prover9(axiom)}.")
lines.append("end_of_list.")
lines.append("")
lines.append("formulas(goals).")
lines.append(f" {self.to_prover9(goal)}.")
lines.append("end_of_list.")
return "\n".join(lines)
def to_prover9(self, formula):
"""Convert FOL formula to Prover9 syntax"""
if isinstance(formula, Atom):
pred = formula.predicate
if not pred.args:
return pred.name
args = ", ".join(self.term_to_prover9(arg) for arg in pred.args)
return f"{pred.name}({args})"
elif isinstance(formula, Not):
return f"-({self.to_prover9(formula.formula)})"
elif isinstance(formula, And):
left = self.to_prover9(formula.left)
right = self.to_prover9(formula.right)
return f"({left} & {right})"
elif isinstance(formula, Or):
left = self.to_prover9(formula.left)
right = self.to_prover9(formula.right)
return f"({left} | {right})"
elif isinstance(formula, Forall):
var = formula.variable.name
body = self.to_prover9(formula.formula)
return f"all {var} ({body})"
elif isinstance(formula, Exists):
var = formula.variable.name
body = self.to_prover9(formula.formula)
return f"exists {var} ({body})"
return "unknown"
def term_to_prover9(self, term):
"""Convert term to Prover9 syntax"""
if isinstance(term, Variable):
return term.name
elif isinstance(term, Constant):
return term.name
elif isinstance(term, Function):
if not term.args:
return term.name
args = ", ".join(self.term_to_prover9(arg) for arg in term.args)
return f"{term.name}({args})"
return "unknown"
def extract_proof(self, output):
"""Extract proof steps from Prover9 output"""
proof_lines = []
in_proof = False
for line in output.split('\n'):
if '====== PROOF ======' in line:
in_proof = True
elif '====== end of proof ======' in line:
break
elif in_proof:
proof_lines.append(line.strip())
return proof_lines
# Example usage
def example_prover9():
prover = Prover9Interface()
# Socrates example
x = Variable("x")
socrates = Constant("socrates")
human_x = Atom(Predicate("Human", [x]))
mortal_x = Atom(Predicate("Mortal", [x]))
human_socrates = Atom(Predicate("Human", [socrates]))
mortal_socrates = Atom(Predicate("Mortal", [socrates]))
axiom1 = Forall(x, Or(Not(human_x), mortal_x))
axiom2 = human_socrates
goal = mortal_socrates
success, proof = prover.prove([axiom1, axiom2], goal)
if success and proof:
print("\nProof steps:")
for step in proof:
print(f" {step}")
Best Practices and Common Pitfalls
Good Pattern: Proper Variable Scoping
class VariableManager:
"""Manage variable scoping and renaming"""
def __init__(self):
self.counter = 0
self.scope_stack = [{}]
def fresh_variable(self, base_name="x"):
"""Generate fresh variable name"""
var_name = f"{base_name}_{self.counter}"
self.counter += 1
return Variable(var_name)
def enter_scope(self):
"""Enter new variable scope"""
self.scope_stack.append({})
def exit_scope(self):
"""Exit current variable scope"""
if len(self.scope_stack) > 1:
self.scope_stack.pop()
def bind_variable(self, var_name, value):
"""Bind variable in current scope"""
self.scope_stack[-1][var_name] = value
def lookup_variable(self, var_name):
"""Look up variable in scope stack"""
for scope in reversed(self.scope_stack):
if var_name in scope:
return scope[var_name]
return None
# Good usage
def good_variable_handling():
vm = VariableManager()
# Create fresh variables for each quantifier
vm.enter_scope()
x1 = vm.fresh_variable("x")
formula1 = Forall(x1, Atom(Predicate("P", [x1])))
vm.exit_scope()
vm.enter_scope()
x2 = vm.fresh_variable("x")
formula2 = Forall(x2, Atom(Predicate("Q", [x2])))
vm.exit_scope()
# x1 and x2 are distinct, no conflicts
print(f"Formula 1: {formula1}")
print(f"Formula 2: {formula2}")
Bad Pattern: Variable Name Conflicts
def bad_variable_handling():
"""BAD: Reusing variable names causes conflicts"""
x = Variable("x")
# WRONG: Same variable used in different contexts
formula1 = Forall(x, Atom(Predicate("P", [x])))
formula2 = Forall(x, Atom(Predicate("Q", [x])))
# WRONG: Combining formulas with same variable
combined = And(formula1, formula2)
# This creates unintended variable capture
print(f"Combined: {combined}")
Good Pattern: Incremental Proof Construction
class IncrementalProver:
"""Build proofs incrementally with checkpointing"""
def __init__(self):
self.prover = TheoremProver(strategy='sos')
self.axiom_stack = []
self.checkpoints = []
def add_axiom(self, axiom):
"""Add axiom to knowledge base"""
self.axiom_stack.append(axiom)
def checkpoint(self):
"""Save current state"""
self.checkpoints.append(len(self.axiom_stack))
def rollback(self):
"""Rollback to last checkpoint"""
if self.checkpoints:
checkpoint = self.checkpoints.pop()
self.axiom_stack = self.axiom_stack[:checkpoint]
def prove_incremental(self, goal):
"""Prove goal with current axioms"""
return self.prover.prove(self.axiom_stack, goal)
def prove_with_assumption(self, assumption, goal):
"""Prove goal assuming additional axiom"""
self.checkpoint()
self.add_axiom(assumption)
result = self.prove_incremental(goal)
self.rollback()
return result
# Example usage
def example_incremental():
prover = IncrementalProver()
# Add base axioms
x = Variable("x")
prover.add_axiom(Forall(x, Atom(Predicate("Human", [x]))))
# Prove with additional assumption
socrates = Constant("socrates")
assumption = Atom(Predicate("Human", [socrates]))
goal = Atom(Predicate("Mortal", [socrates]))
result = prover.prove_with_assumption(assumption, goal)
print(f"Proof with assumption: {result}")
Bad Pattern: Monolithic Proof Attempts
def bad_proof_attempt(all_axioms, goal):
"""BAD: No incremental building or checkpointing"""
prover = TheoremProver()
# WRONG: Trying to prove everything at once
# No way to identify which axioms are needed
# No way to debug failed proofs
success, trace = prover.prove(all_axioms, goal)
if not success:
# WRONG: No information about why proof failed
print("Proof failed, no idea why")
return success
Debugging and Visualization
Proof Tree Visualization
class ProofVisualizer:
"""Visualize proof trees and resolution steps"""
def __init__(self):
self.nodes = []
self.edges = []
def visualize_proof(self, proof_trace):
"""Create visual representation of proof"""
for i, step in enumerate(proof_trace):
parent1 = step['parent1']
parent2 = step['parent2']
resolvent = step['resolvent']
mgu = step['mgu']
print(f"\nStep {i + 1}:")
print(f" Parent 1: {parent1}")
print(f" Parent 2: {parent2}")
print(f" MGU: {mgu}")
print(f" Resolvent: {resolvent}")
if not resolvent.literals:
print(f" โ Empty clause (contradiction)")
def export_dot(self, proof_trace, filename='proof.dot'):
"""Export proof as Graphviz DOT file"""
lines = ['digraph Proof {']
lines.append(' rankdir=TB;')
lines.append(' node [shape=box];')
for i, step in enumerate(proof_trace):
parent1_id = f"p1_{i}"
parent2_id = f"p2_{i}"
resolvent_id = f"r_{i}"
lines.append(f' {parent1_id} [label="{step["parent1"]}"];')
lines.append(f' {parent2_id} [label="{step["parent2"]}"];')
lines.append(f' {resolvent_id} [label="{step["resolvent"]}"];')
lines.append(f' {parent1_id} -> {resolvent_id};')
lines.append(f' {parent2_id} -> {resolvent_id};')
lines.append('}')
with open(filename, 'w') as f:
f.write('\n'.join(lines))
print(f"Proof exported to {filename}")
print(f"Visualize with: dot -Tpng {filename} -o proof.png")
# Example usage
def example_visualization():
prover = TheoremProver(strategy='sos')
visualizer = ProofVisualizer()
# Run proof
x = Variable("x")
socrates = Constant("socrates")
axiom1 = Forall(x, Or(Not(Atom(Predicate("Human", [x]))),
Atom(Predicate("Mortal", [x]))))
axiom2 = Atom(Predicate("Human", [socrates]))
goal = Atom(Predicate("Mortal", [socrates]))
success, trace = prover.prove([axiom1, axiom2], goal)
if success:
visualizer.visualize_proof(trace)
visualizer.export_dot(trace)
Performance Benchmarks
Comparing Different Strategies
import time
class ProverBenchmark:
"""Benchmark different theorem proving strategies"""
def __init__(self):
self.results = []
def benchmark_strategy(self, strategy, axioms, goal, name):
"""Benchmark a single strategy"""
prover = TheoremProver(max_iterations=1000, strategy=strategy)
start_time = time.time()
success, trace = prover.prove(axioms, goal)
end_time = time.time()
elapsed = end_time - start_time
result = {
'name': name,
'strategy': strategy,
'success': success,
'time': elapsed,
'steps': len(trace) if trace else 0
}
self.results.append(result)
return result
def run_benchmarks(self, axioms, goal):
"""Run all strategy benchmarks"""
strategies = [
('basic', 'Basic Resolution'),
('sos', 'Set of Support'),
('unit', 'Unit Resolution')
]
print("Running benchmarks...")
print("=" * 60)
for strategy, name in strategies:
result = self.benchmark_strategy(strategy, axioms, goal, name)
print(f"\n{name}:")
print(f" Success: {result['success']}")
print(f" Time: {result['time']:.4f}s")
print(f" Steps: {result['steps']}")
print("\n" + "=" * 60)
self.print_summary()
def print_summary(self):
"""Print benchmark summary"""
print("\nSummary:")
print(f"{'Strategy':<20} {'Success':<10} {'Time (s)':<12} {'Steps':<10}")
print("-" * 60)
for result in sorted(self.results, key=lambda r: r['time']):
print(f"{result['name']:<20} {str(result['success']):<10} "
f"{result['time']:<12.4f} {result['steps']:<10}")
# Example usage
def example_benchmark():
benchmark = ProverBenchmark()
# Define problem
x = Variable("x")
y = Variable("y")
# Transitivity of equality
axiom1 = Forall(x, Atom(Predicate("Eq", [x, x])))
axiom2 = Forall(x, Forall(y,
Or(Not(Atom(Predicate("Eq", [x, y]))),
Atom(Predicate("Eq", [y, x])))))
a = Constant("a")
b = Constant("b")
goal = Atom(Predicate("Eq", [a, b]))
benchmark.run_benchmarks([axiom1, axiom2], goal)
Real-World Case Studies
Case Study 1: Verifying Cryptographic Protocols
class CryptoProtocolVerifier:
"""Verify security properties of cryptographic protocols"""
def __init__(self):
self.prover = TheoremProver(strategy='sos')
def verify_authentication(self, protocol):
"""Verify authentication property of protocol"""
# Model protocol as FOL axioms
axioms = self.model_protocol(protocol)
# Authentication property: if B receives message from A,
# then A actually sent it
a = Constant("alice")
b = Constant("bob")
msg = Variable("msg")
# Goal: Received(b, msg, a) โ Sent(a, msg, b)
received = Atom(Predicate("Received", [b, msg, a]))
sent = Atom(Predicate("Sent", [a, msg, b]))
goal = Or(Not(received), sent)
success, trace = self.prover.prove(axioms, goal)
if success:
print("โ Authentication property verified")
else:
print("โ Authentication property violated")
print("Potential attack found")
return success
def model_protocol(self, protocol):
"""Convert protocol to FOL axioms"""
axioms = []
# Model encryption/decryption
# Encrypt(k, m) can be decrypted with k
k = Variable("k")
m = Variable("m")
encrypted = Function("Encrypt", [k, m])
axioms.append(
Forall(k, Forall(m,
Atom(Predicate("CanDecrypt", [k, encrypted]))))
)
# Model message transmission
# If A sends encrypted message, B can receive if has key
# ... (additional protocol-specific axioms)
return axioms
def verify_secrecy(self, protocol, secret):
"""Verify that secret remains confidential"""
axioms = self.model_protocol(protocol)
# Attacker should not learn secret
attacker = Constant("attacker")
goal = Not(Atom(Predicate("Knows", [attacker, secret])))
success, trace = self.prover.prove(axioms, goal)
if success:
print("โ Secrecy property verified")
else:
print("โ Secrecy property violated")
print("Information leak detected")
return success
# Example: Needham-Schroeder protocol
def example_crypto_verification():
verifier = CryptoProtocolVerifier()
protocol = {
'name': 'Needham-Schroeder',
'steps': [
'A -> B: {Na, A}Kb',
'B -> A: {Na, Nb}Ka',
'A -> B: {Nb}Kb'
]
}
result = verifier.verify_authentication(protocol)
print(f"Verification result: {result}")
Case Study 2: Hardware Verification
class HardwareVerifier:
"""Verify hardware circuit correctness"""
def __init__(self):
self.prover = TheoremProver(strategy='unit')
def verify_circuit(self, circuit_spec, property_spec):
"""Verify that circuit satisfies property"""
# Convert circuit to FOL
circuit_axioms = self.circuit_to_fol(circuit_spec)
# Convert property to FOL
property_formula = self.property_to_fol(property_spec)
# Prove property holds
success, trace = self.prover.prove(circuit_axioms, property_formula)
return success
def circuit_to_fol(self, circuit):
"""Convert circuit specification to FOL axioms"""
axioms = []
# Model gates
for gate in circuit.get('gates', []):
gate_axiom = self.model_gate(gate)
axioms.append(gate_axiom)
# Model connections
for connection in circuit.get('connections', []):
conn_axiom = self.model_connection(connection)
axioms.append(conn_axiom)
return axioms
def model_gate(self, gate):
"""Model logic gate as FOL formula"""
gate_type = gate['type']
inputs = [Variable(inp) for inp in gate['inputs']]
output = Variable(gate['output'])
if gate_type == 'AND':
# output = input1 AND input2
formula = Atom(Predicate("Equals", [
output,
Function("And", inputs)
]))
elif gate_type == 'OR':
formula = Atom(Predicate("Equals", [
output,
Function("Or", inputs)
]))
elif gate_type == 'NOT':
formula = Atom(Predicate("Equals", [
output,
Function("Not", inputs)
]))
else:
formula = None
return formula
def model_connection(self, connection):
"""Model wire connection"""
source = Variable(connection['from'])
dest = Variable(connection['to'])
return Atom(Predicate("Equals", [source, dest]))
def property_to_fol(self, property_spec):
"""Convert property specification to FOL"""
# Example: "output is always 1 when both inputs are 1"
return None
# Example: Verify full adder
def example_hardware_verification():
verifier = HardwareVerifier()
full_adder = {
'gates': [
{'type': 'XOR', 'inputs': ['a', 'b'], 'output': 'sum1'},
{'type': 'XOR', 'inputs': ['sum1', 'cin'], 'output': 'sum'},
{'type': 'AND', 'inputs': ['a', 'b'], 'output': 'carry1'},
{'type': 'AND', 'inputs': ['sum1', 'cin'], 'output': 'carry2'},
{'type': 'OR', 'inputs': ['carry1', 'carry2'], 'output': 'cout'}
],
'connections': []
}
property_spec = {
'type': 'correctness',
'description': 'sum and carry are correct'
}
result = verifier.verify_circuit(full_adder, property_spec)
print(f"Full adder verification: {result}")
Case Study 3: Semantic Web Reasoning
class SemanticWebReasoner:
"""Reason over RDF/OWL ontologies using FOL"""
def __init__(self):
self.prover = TheoremProver(strategy='sos')
self.ontology = []
def load_ontology(self, ontology_axioms):
"""Load ontology axioms"""
self.ontology = ontology_axioms
def query(self, query_formula):
"""Query ontology"""
success, trace = self.prover.prove(self.ontology, query_formula)
return success
def infer_subclass(self, class_a, class_b):
"""Infer if class_a is subclass of class_b"""
x = Variable("x")
# Query: โx. class_a(x) โ class_b(x)
instance_a = Atom(Predicate(class_a, [x]))
instance_b = Atom(Predicate(class_b, [x]))
query = Forall(x, Or(Not(instance_a), instance_b))
return self.query(query)
def infer_property(self, subject, property_name, object_value):
"""Infer if subject has property with object"""
query = Atom(Predicate(property_name, [
Constant(subject),
Constant(object_value)
]))
return self.query(query)
# Example: Animal taxonomy reasoning
def example_semantic_web():
reasoner = SemanticWebReasoner()
# Define ontology
x = Variable("x")
# Axiom: All dogs are mammals
dog = Atom(Predicate("Dog", [x]))
mammal = Atom(Predicate("Mammal", [x]))
axiom1 = Forall(x, Or(Not(dog), mammal))
# Axiom: All mammals are animals
animal = Atom(Predicate("Animal", [x]))
axiom2 = Forall(x, Or(Not(mammal), animal))
# Axiom: Fido is a dog
fido = Constant("fido")
axiom3 = Atom(Predicate("Dog", [fido]))
reasoner.load_ontology([axiom1, axiom2, axiom3])
# Query: Is Fido an animal?
result = reasoner.infer_property("fido", "Animal", "true")
print(f"Is Fido an animal? {result}")
# Query: Are dogs animals?
result = reasoner.infer_subclass("Dog", "Animal")
print(f"Are dogs animals? {result}")
Advanced Topics
Modal Logic Extensions
class ModalTheoremProver:
"""Theorem prover for modal logic (knowledge, belief, time)"""
def __init__(self):
self.prover = TheoremProver(strategy='sos')
self.worlds = set()
self.accessibility = {}
def add_world(self, world):
"""Add possible world"""
self.worlds.add(world)
def add_accessibility(self, world1, world2):
"""Add accessibility relation between worlds"""
if world1 not in self.accessibility:
self.accessibility[world1] = set()
self.accessibility[world1].add(world2)
def knows(self, agent, formula):
"""Agent knows formula (true in all accessible worlds)"""
# K(agent, ฯ) โก โw. Accessible(current, w) โ Holds(ฯ, w)
w = Variable("w")
current = Constant("current_world")
accessible = Atom(Predicate("Accessible", [current, w]))
holds = Atom(Predicate("Holds", [formula, w]))
return Forall(w, Or(Not(accessible), holds))
def believes(self, agent, formula):
"""Agent believes formula (similar to knows but weaker)"""
# Similar to knows but with belief-accessibility relation
return self.knows(agent, formula)
def eventually(self, formula):
"""Formula eventually becomes true (temporal logic)"""
# F(ฯ) โก โt. Future(current, t) โง Holds(ฯ, t)
t = Variable("t")
current = Constant("current_time")
future = Atom(Predicate("Future", [current, t]))
holds = Atom(Predicate("Holds", [formula, t]))
return Exists(t, And(future, holds))
def always(self, formula):
"""Formula always holds (temporal logic)"""
# G(ฯ) โก โt. Future(current, t) โ Holds(ฯ, t)
t = Variable("t")
current = Constant("current_time")
future = Atom(Predicate("Future", [current, t]))
holds = Atom(Predicate("Holds", [formula, t]))
return Forall(t, Or(Not(future), holds))
# Example: Knowledge reasoning
def example_modal_logic():
prover = ModalTheoremProver()
# Setup worlds
prover.add_world("w1")
prover.add_world("w2")
prover.add_accessibility("w1", "w2")
# Alice knows that Bob is tall
alice = Constant("alice")
bob = Constant("bob")
tall = Atom(Predicate("Tall", [bob]))
knowledge = prover.knows(alice, tall)
print(f"Alice knows Bob is tall: {knowledge}")
Higher-Order Logic
class HigherOrderProver:
"""Prover for higher-order logic (predicates as arguments)"""
def __init__(self):
self.prover = TheoremProver(strategy='sos')
def apply_predicate(self, pred_var, arg):
"""Apply predicate variable to argument"""
# P(x) where P is a variable
return Atom(Predicate("Apply", [pred_var, arg]))
def forall_predicates(self, pred_var, formula):
"""Quantify over predicates"""
# โP. formula(P)
return Forall(pred_var, formula)
def exists_predicate(self, pred_var, formula):
"""Existential quantification over predicates"""
# โP. formula(P)
return Exists(pred_var, formula)
# Example: Prove properties about properties
def example_higher_order():
prover = HigherOrderProver()
# โP. (โx. P(x)) โ P(a)
# "If property holds for all, it holds for specific instance"
p = Variable("P") # Predicate variable
x = Variable("x")
a = Constant("a")
p_of_x = prover.apply_predicate(p, x)
p_of_a = prover.apply_predicate(p, a)
formula = prover.forall_predicates(p,
Or(Not(Forall(x, p_of_x)), p_of_a))
print(f"Higher-order formula: {formula}")
Conclusion and Best Practices Summary
First-order theorem proving is a powerful technique with applications across computer science, from software verification to artificial intelligence. Here are the key takeaways:
- Always convert formulas to CNF before applying resolution
- Use unification with occurs check to avoid infinite structures
- Apply simplification techniques (subsumption, tautology elimination)
- Choose appropriate resolution strategies (SOS, unit resolution)
- Index clauses for efficient retrieval
- Use weight heuristics to guide search
- Implement proper variable scoping and renaming
- Build proofs incrementally with checkpointing
- Visualize proofs for debugging
- Benchmark different strategies for your problem domain
Good Pattern: Complete Prover Template
class ProductionTheoremProver:
"""Production-ready theorem prover with all optimizations"""
def __init__(self, config=None):
self.config = config or {
'max_iterations': 1000,
'strategy': 'sos',
'use_indexing': True,
'use_weights': True,
'simplify': True,
'timeout': 60
}
self.prover = TheoremProver(
max_iterations=self.config['max_iterations'],
strategy=self.config['strategy']
)
if self.config['use_indexing']:
self.index = ClauseIndex()
self.stats = {
'clauses_generated': 0,
'clauses_simplified': 0,
'resolutions': 0
}
def prove(self, axioms, goal, verbose=False):
"""Prove goal from axioms with full optimization"""
start_time = time.time()
# Convert to CNF
cnf_axioms = [self.to_cnf(ax) for ax in axioms]
cnf_goal = self.to_cnf(Not(goal))
# Extract and simplify clauses
clauses = self.extract_all_clauses(cnf_axioms + [cnf_goal])
if self.config['simplify']:
clauses = simplify_clause_set(clauses)
# Build index
if self.config['use_indexing']:
for clause in clauses:
self.index.add_clause(clause)
# Run prover
success, trace = self.prover.prove_basic(clauses)
elapsed = time.time() - start_time
if verbose:
self.print_stats(elapsed)
return success, trace
def print_stats(self, elapsed):
"""Print proving statistics"""
print(f"\nStatistics:")
print(f" Time: {elapsed:.4f}s")
print(f" Clauses generated: {self.stats['clauses_generated']}")
print(f" Clauses after simplification: {self.stats['clauses_simplified']}")
print(f" Resolution steps: {self.stats['resolutions']}")
Bad Pattern: Incomplete Implementation
def bad_prover(axioms, goal):
"""BAD: Missing critical features"""
# WRONG: No CNF conversion
# WRONG: No simplification
# WRONG: No indexing
# WRONG: No heuristics
# WRONG: No timeout
# WRONG: No statistics
clauses = axioms + [Not(goal)]
while True: # WRONG: Infinite loop possible
for c1 in clauses:
for c2 in clauses:
resolvent = resolve(c1, c2)
if resolvent:
clauses.append(resolvent) # WRONG: Duplicates
if not resolvent.literals:
return True
return False
External Resources
- “Automated Reasoning” by Wos, Overbeek, Lusk, and Boyle
- “Handbook of Automated Reasoning” edited by Robinson and Voronkov
- “Logic for Computer Science” by Steve Reeves and Michael Clarke
- Prover9/Mace4 documentation: https://www.cs.unm.edu/~mccune/prover9/
- Z3 Theorem Prover: https://github.com/Z3Prover/z3
- TPTP Problem Library: http://www.tptp.org/
- Isabelle/HOL: https://isabelle.in.tum.de/
- Coq Proof Assistant: https://coq.inria.fr/
Related Articles
- Propositional Logic and SAT Solvers
- Automated Reasoning in AI
- Formal Verification Methods
- Logic Programming with Prolog
- Type Theory and Dependent Types
- Model Checking Techniques
- SMT Solvers and Applications
- Interactive Theorem Proving
First-order theorem proving continues to be an active area of research with new algorithms, optimizations, and applications emerging regularly. Whether you’re verifying software, reasoning about knowledge, or solving complex logical puzzles, understanding these fundamental techniques provides a solid foundation for automated reasoning.
Comments