Lecture 6

Resolution theorem proving

Summary of Transformation to Clauses

Resolution operates on sets of clauses, which is essentially a disjunctive normal form of formulas without existential quantifiers. Skolemization is the way to remove existential quantifiers at the cost of introducing new uninterpreted function symbols; skolemization preserves satisfiability.

From formulas to clauses

  • negation normal form: propagate negation to atomic formulas
  • prenex form: pull quantifiers to the top
  • Skolemization: replace existential quantifiers with Skolem constants (not quantifier elimination!)
    ALL x y. EX z. F


    ALL x y. F[z:=s(x,y)]

where s(x,y) is a new function symbol of two arguments. You can think of Skolemization as a way of swapping quantifiers

    EX s. ALL x y. F[z:=s(x,y)]

where s is existentially quantified. Because we are interested in satisfiability, we do not need to write EX s (and we cannot because we are in first-order logic).

  • conjunctive normal form: each conjunct is called a clause

We typically do not write universal quantifiers either, because we know that all variables in a clause are universally quantified.

Term models

Resolution is

  • sound (inference rules are correct)
  • complete (inference rules are sufficient to prove all valid formulas)

This is remarkable because there are arbitrary models of first-order logic (sets, numbers, etc.) and yet if something is valid in all models there is a resolution proof of it. This follows from the fact that instead of considering all models, we can consider just term models.

L - all function symbols and constants in our language (including any Skolem functions). Assume at least one constant.

Term(L) - set of all ground terms in language L, give as the set T

T ::= c | f(T,...,T)

where c is a constant in L and f is a function symbol in L. So, Term(L) contains trees, syntactic structures. They are like algebraic data types in Ocaml. They are equal only if they are identical

if  f(t1,t2) = f(s1,s2)  then t1=s1 and t2=s2

This would not hold for any function f, but holds for terms because they do not evaluate things, just put things together, like pairs.

Resolution relies on the following property:

Herbrand's theorem: If a set of clauses in language L is true in some interpretation with domain D, then it is true in an interpretation with domain Term(L).

Proof sketch: If set $S$ of clauses is true in interpretation $e$, then define interpretation $e_T$ with domain Term(L) by evaluating functions as in the term model $[[f]](t_1,t_2) = f(t_1,t_2)$, and defining the value of relations by $[[R]]e_T = \{(t_1,t_2)\mid [[R(t_1,t_2)]]e = \mbox{true} \}$ (end of sketch).

Note: this is in language without equality. With equality, we need to take equivalence classes of equal terms. So, in the presence of equality the term model can be finite.

Therefore, formula is satisfiable iff it has a term model.

Basic Idea of Resolution

Consider two clauses

 ¬F1 | R(x,f(g(y)))    (1)
 ¬R(h(u),f(v)) | F2    (2)

that is

 F1 -> R(x,f(g(y)))   (1)
 R(h(u),f(v)) -> F2   (2)

Note that x,y,u,v are all universally quantified. Moreover, by Herbrand theorem we assume they all range over terms. We would like to see where R(x,f(y)) and R(h(u),v) overlap. To get as strong consequence as possible, we look for the solution of the equation on terms

x == h(u)
f(g(y)) == f(v)


The second equation is equivalent to g(y)==v, so we get as a result the solution

x == h(u)
v == g(y)

Such solution is called the most general unifier. We can represent it as substitution (x:=h(u),v:=g(y)). There are infinitely many unifiers, but all unifiers are instances of the most general one.

In general, unification is a way to solve equations in the term model. It's main uses are

  • resolution theorem proving
  • Prolog (particular resolution strategy)
  • type inference for parameterized types (Hindley-Milner algorithm)

Unification algorithm: maintains a set of equations between terms.

  • deconstruct: $f(t_1,\ldots,t_n) = f(s_1,\ldots,s_n)$ into $t_1 = s_1, \ldots, t_n = s_n$
  • if $f(t_1,\ldots,t_n) = g(s_1,\ldots,s_m)$ is in the set for $f$ and $g$ distinct function symbols, report no solutions
  • reorient $t=x$ into $x=t$ for variable $x$ and non-variable term $t$
  • if $x=t$ is in the set and $x$ does not occur in $t$, replace $x$ with $t$ everywhere
  • remove $x=x$
  • if $x=t$ is in the set and $x$ occurs in $t$ (and $t$ is not $x$), report no solution

Resolution proof rules


  \frac{A_1 \lor F_1, \ \ \lnot A_2 \lor F_2}
       {(F_1 \lor F_2)\theta}

where $\theta$ is the most general unifier of $A_1$ and $A_2$.


  \frac{A_1 \lor A_2 \lor F}
       {(A_1 \lor F)\theta}

where $\theta$ is the most general unifier of $A_1$ and $A_2$.


Trying to prove validity of

  \exists y. \forall x. R(x,y) &\rightarrow& \forall x. \exists y. R(x,y)   \\
  \forall x. \exists y. R(x,y) &\rightarrow& \exists y. \forall x. R(x,y)

Proof that the formula(1) is valid, by proving that the negation of the formula does not end with a counter-example (same principle seen in lecture 5 to prove validity of loop invariant).

¬(∃y.∀x.R(x,y) ⇒ ∀x.∃y.R(x,y))

by: (A => B) <=> ¬A ∨ B

¬(¬∃y.∀x.R(x,y) ∨ ∀x.∃y.R(x,y))

by: ¬(A ∨ B) <=> ¬A ∧ ¬B

∃y.∀x.R(x,y) ∧ ¬∀x.∃y.R(x,y)

by: ¬(∃x.F) <=> ∀x.¬F   and   ¬(∀x.F) <=> ∃x.¬F

∃y.∀x.R(x,y) ∧ ∃x.∀y.¬R(x,y)

by: Skolemization

{R(x,s1), ¬R(s2,y)}

by: Unification   R(s2, s1)  where x = s2, s1 = y


Proof that the formula(2) is not valid, by proving that the negation of the formula ends with a counter-example.

¬(∀x.∃y.R(x,y) ⇒ ∃y.∀x.R(x,y))

by: (A => B) <=> ¬A ∨ B	

¬(¬∀x.∃y.R(x,y) ∨ ∃y.∀x.R(x,y))

by: ¬(A ∨ B) <=> ¬A ∧ ¬B

∀x.∃y.R(x,y) ∧ ¬∃y.∀x.R(x,y)

by: ¬(∃x.F) <=> ∀x.¬F   and   ¬(∀x.F) <=> ∃x.¬F

∀x.∃y.R(x,y) ∧ ∀y.∃x.¬R(x,y)

by: Skolemization

{R(x, s1(x)), ¬R(s2(y),y)}

by: Unification   s1(x)=y  and  x=s2(y) then s1(s2(y))=y
by: Adding a constant 'a'

R.{s1,s2,a} = L

Term(L) = {a, s1(a), s2(a), s1(s1(a)), …}

[R]eT = {(t,s1(t)) | t ∈ Term(L)} where eT: F ⇒ {true, false} and T ⇒ Term(L)

[s1(s2(a))]eT = s1(s2(a))

[∀.R(x,s1(x))]eT = true

∀t ∈ Term(L) [R(x,s1(x))]eT[x:=t]

∀t ∈ Term(L) ¬[R(s2(y),y) ] R(s2(y),y)]eT[y:=t]

[R]eT = {(a,s1(a)), (s1(a),s1(s1(a))), (s2(a), s2(s2(a))), …}

Then we should find a pair (tx,ty) s.t. (tx,ty) ∉ [R]. A possible counter-example is (tx,a) ∉ [R], ∀tx. The fact that we have found a counter-example imply that the initial formula is invalid.

Consider case where R denotes less than relation on integers and Ev denotes that integer is even

(ALL x. EX y. R(x,y))  &
(ALL x y z. R(x,y) --> R(x, plus(x,z)))  &
(Ev(x) | Ev(plus(x,1))) -->
  (ALL x. EX y. R(x,y) & E(y))


Theorem prover competition:

Hanbdook of Automated Reasoning:

You can also check Chapter 8 of Gallier Logic Book.

Abstract Interpretation

Strongest postcondition of loop

If we know the set S of reachable states at one program point, and we have command given by relation r, we compute

  sp(S,r) = S \odot r = \{y \mid \exists x. x \in S \land (x,y) \in r \}

We have seen that a loop reduces to transitive closure of a relation, along with non-deterministic choice and assume statements:

while (F) c === (assume(F);c)*; assume(~F)

Recursive definition of transitive closure

  r^* = \Delta \cup r \circ r^*

We then have

  S \odot r^* = S \cup ((S \odot r) \odot r^*)

If S was finite (and small), this would give us algorithm for computing the set of reachable states.

S is infinite, instead of S, we use a representation of some approximation of S. Because we want to be sure to account for all possibilities (and not miss errors), we use overapproximation of S.

Background on Lattices

Recall notions of preorder, partial order, equivalence relation.

Partial order where each two element set has

  • supremum (least upper bound, join, $\sqcup$)
  • infimum (greatest lower bound, meet, $\sqcap$)

So, we have

  x \sqsubseteq x \sqcup y & \mbox{(bound)} \\
  y \sqsubseteq x \sqcup y & \mbox{(bound)} \\
  x \sqsubseteq z \ \land\ y \sqsubseteq z\ \rightarrow\ x \sqcup y \sqsubseteq z & \mbox{(least)}

and dually for $\sqcap$.

See also Lattice (order) and, for more details, lecture notes by J.B. Nation or Chapter I of a Course in Universal Algebra.

Example Hasse diagrams for powerset of {1,2} and for constant propagation.

Properties of a lattice

  • finite lattice
  • finite ascending chain lattice

Basic notions of abstract interpretation

Let PS denote program states and $C = 2^{PS}$ denote the set of all subsets of PS, so $S \subseteq PS$ and $S \in C$ for a set of states S. We introduce an abstract analysis domain A and functions

  \gamma : A \to C \ \ \mbox{concretization function} \\

Instead of working with C, we work with simpler domain A, which is the data structure for analysis.

Examples of A:

  • mapping from variables to information about the values they can have (interval analysis, sign analysis, constant propagation)
  • formula F (usually in canonical form) such that $\gamma(F) = \{ x. F \}$

So, we will impose that our domain is partial order. It is useful to require additional property, namely that we have lattice, which have “union”-like and “intersection”-like operations.

Note that $\gamma$ allows us to define set-like operations on A that correspond to operations on A:

  a_1 \sqsubseteq a_2   \iff   \gamma(a_1) \subseteq \gamma(a_2)

Instead of $sp : C \to C$, we have its approximation $sp\# : A \to A$. To be sound we need to have

  \gamma(sp\#(a_1,r)) \supseteq sp(\gamma(a_1),r)

This is the most important condition for abstract interpretation.

One way to find $sp\#$ it is to have $\alpha$ that goes the other way

  \alpha : C \to A \ \ \mbox{abstraction function}

such that

   \gamma(\alpha(S)) \supseteq S \\
   \alpha(\gamma(a)) = a

Then we can write a definition of $sp\# : A \to A$ in terms of it:

  sp\#(a,r) = \alpha(sp(\gamma(a),r))

and we call this “the best transformer”.

Abstract computation of program semantics

In our lattice we have

  \gamma(a_1 \sqcup a_2) &\supseteq& \gamma(a_1) \cup \gamma(a_2)

because of the way we defined order.

Then we rewrite our computation of sp for the loop by replacing $sp$ with $sp\#$ and replacing $\cup$ with $\sqcup$. We obtain fixpoint equation for computing the effect of a transitive closur eof a relation (i.e. a loop):

  sp\#(a,r^*) = a \sqcup sp\#(sp\#(a,r),r^*)

We can then iterate this equation until it stabilizes.

From the properties of $\gamma$ it follows that the result is approximation of the set of reachable states.

When does it stabilize? Note that the sequence is increasing and if we ever repeat we have stabilized. If there are no infinite ascending chains, the computation terminates.

Simple example with small ascending chains but large size: constant propagation.

Example analyses

  • initialization analysis
  • constant propagation
  • sign analysis