LARA

Compositional Verification Condition Generator

Rules for Computing Formulas for Commands Compositionally

We compute for each command $c_1$ the formula $F_c(c_1)$. We next show how to do it. These rules all follow from the semantics of our language.

Assignment

\begin{equation*}
   F_c(x=e) = (x'=e \land \bigwedge_{v \in V \setminus \{x\}} v=v')
\end{equation*}

Assume

\begin{equation*}
   F_c({\it assume}(e)) = (e \land \bigwedge_{v \in V} v=v')
\end{equation*}

Havoc

\begin{equation*}
   F_c({\it havoc}(x)) = (\bigwedge_{v \in V \setminus\{x\}} v=v')
\end{equation*}

For simplicity of notation, in the sequel we work with state that has only one variable, x.

Union

Note

\begin{equation*}
   \{(\vec x,\vec x') \mid F_1 \} \cup \{(\vec x,\vec x') \mid F_2 \} = \{(\vec x,\vec x') \mid F_1 \lor F_2 \}
\end{equation*}

Therefore,

\begin{equation*}
   F_c(c_1 [] c_2) = F_c(c_1) \lor F_c(c_2)
\end{equation*}

Sequential Composition

Note that

\begin{equation*}
\begin{array}{l}
    \{(\vec x,\vec x') \mid F_1 \} \circ \{(\vec x,\vec x') \mid F_2 \} \\
\qquad
 = \{ (\vec x,\vec z) \mid F_1[\vec x':=\vec z] \} \circ \{ (\vec z,x') \mid F_2[\vec x:=\vec z] \} \\
\qquad
 = \{ (\vec x,\vec z) \mid \exists \vec z. F_1[\vec x':=\vec z] \land F_2[\vec x:=\vec z] \}
\end{array}
\end{equation*}

Therefore,

\begin{equation*}
   F_c(c_1\ ;\ c_2)\ =\ (\exists \vec z. F_c(c_1)[\vec x':=\vec z] \land F_c(c_2)[\vec x:=\vec z])
\end{equation*}

Above, $F_c(c_1)[\vec x':=\vec z]$ denotes taking formula $F_c(c_1)$ and replacing in it occurrences of variables $\vec x'$ by variables $\vec z$.

To avoid re-using variables, introduce always a fresh variable as $\vec z$ and denote it $\vec z_i$.

Loops?

The semantics for loops is given using transitive closure.

To express this, we would need a logic that supports transitive closure.

If we do this, our formulas will look very much like the programs and there are few good techniques for reasoning about them.

We will therefore deal with loops using alternative techniques (to be discussed later). A simple technique is unrolling loops.

Example

Take the program in the example below:

(if (x < 0) x=x+1 else x=x);
(if (y < 0) y=y+x else y=y);

It translates into the following guarded command language:

\begin{equation*}
\begin{array}{l}
(assume(x<0) ; (x = x+1)\ [] assume (\lnot (x<0)) ; skip)\ ; \\
(assume(y<0) ; (y = y+x)\ [] assume(\lnot(y<0)) ; skip)
\end{array}
\end{equation*}

Computing $F_c$ for basic commands we obtain:

\begin{equation*}
\begin{array}{rcl}
  F_c(assume(x < 0)) &=& x < 0 \land x'=x \land y'=y \\
  F_c(x=x+1) &=& x' = x + 1 \land y' = y \\
  F_c(assume(\lnot (x < 0))) &=& \lnot (x < 0) \land x'=x \land y'=y \\  
  F_c(assume(y < 0)) &=& y < 0 \land x'=x \land y'=y \\
  F_c(y=y+x) &=& y'=y+x \land x'=x \\
  F_c(assume(\lnot(y<0)) &=& \lnot (y < 0) \land x'=x \land y'=y \\
  F_c(skip) &=& x'=x \land y'=y
\end{array}
\end{equation*}

Applying innermost sequential compositions, we obtain

\begin{equation*}
\begin{array}{rcl}
  F_c(assume(x < 0);x+x+1) &=& 
  \exists x_1,y_1. 
\begin{array}[t]{l}
(x < 0 \land x'=x \land y'=y)[x':=x_1,y':=y_1] \land \\
(x' = x + 1 \land y' = y)[x:=x_1,y:=y_1] =
\end{array} \\
&=&
  \exists x_1,y_1. x < 0 \land x_1 = x \land y_1 = y \land x'=x_1+1 \land y'=y_1 \\
& & \mbox{optional simplification step} \\
&=&
  x < 0 \land x'=x+1 \land y'=y
\end{array}
\end{equation*}

We similarly obtain

\begin{equation*}
   F_c(assume (\lnot (x<0)) ; skip) = \lnot (x < 0) \land y'=y
\end{equation*}

Thus, the meaning of the first if statement is:

\begin{equation*}
   (x < 0 \land x'=x+1 \land y'=y) \lor (\lnot (x < 0) \land x'=x \land y'=y)
\end{equation*}

We can similarly show that the meaning of the second 'if' statement is:

\begin{equation*}
   (y < 0 \land x'=x \land y' = y + x) \lor (\lnot (y < 0) \land x'=x \land y'=y)
\end{equation*}

By applying renamings to $x,y$ in first formula and $x',y'$ in the second formula, we obtain:

\begin{equation*}
\begin{array}{l}
   \exists x_2,y_2.\\
   \left((x < 0 \land x_2=x+1 \land y_2=y) \lor (\lnot (x < 0) \land x_2=x \land y_2=y)\right)\ \land \\
   \left((y_2 < 0 \land x'=x_2 \land y' = y_2 + x_2) \lor (\lnot (y_2 < 0) \land x'=x_2 \land y'=y_2)\right)
\end{array}
\end{equation*}

Note that, even if we had left the existential quantifiers over $x_1,y_1$ inside, we would be able to lift them to the top level through $\land,\lor$.

Useful Consequences of Rules

If $P$ is a formula containing only unprimed variables, let $P'$ denote the formula where all program variables become primed ($x$ is replaced by $x'$). More precisely

\begin{equation*}
   P' = P[x_1 := x'_1, \ldots, x_n := x'_n]
\end{equation*}

We then have the following:

\begin{equation*}
   F_c(assume(P); c_2) = P \land F_c(c_2)
\end{equation*}

\begin{equation*}
   F_c(c_1 ; assume(P)) = F_c(c_1) \land P'
\end{equation*}

Formula Simplification: One-Point Rule

Assignments and assume statements generate equalities, many of which can be eliminated by one-point rule

\begin{equation*}
    (\exists x. x=t \land F) \leftrightarrow F[x:=t]
\end{equation*}

There are more complex quantifier elimination procedures that can be used in principle as well.

Shape of Generated Formulas

Theorem: Consider a loop-free program $c_1$. Then each formula generated using the rules above is equivalent to

\begin{equation*}
   \exists z_1,\ldots, z_n. C(F_1,\ldots,F_m)
\end{equation*}

where $F_1,\ldots,F_m$ are formulas corresponding to havoc,assume (and assignment) statements (with some variables renamed), and $C$ denotes a propositional combination that uses only $\land,\lor$, and where the free variable in the above formula are unprimed and primed program variables. The resulting formula is polynomial in the size of $c_1$.

Proof: By induction on the structure of $c_1$. The base cases is trivial, taking $F_1$ to be the formula defining the relation. For the case of non-deterministic choice the property follows easily by lifting existential quantifiers to the top level and taking disjunction of the combinations $C$. Similarly for sequential composition, where we do variable renaming and take conjunction of the combinations $C$.

To see that the 'compositional approach' generates a formula polynomial in the size of the program. Fix the set of variables $V$. Then:

  • the size of formula for each basic command is constant
  • non-deterministic choice is disjunction
  • sequential composition is conjunction (along with renaming that does not affect size)

Moreover, formula generated in such a way looks very much like the program itself, converted to static single assignment form, see

Using Computed Formulas to Obtain Verification Conditions

Suppose we wish to prove

\begin{equation*}
    \{ P \} c_1 \{ Q \}
\end{equation*}

The resulting formula has the form:

\begin{equation*}
    \forall \vec x,\vec x'.\ (P \land (\exists \vec z_1,\ldots,\vec z_n. F_c(c_1))\  \rightarrow Q)
\end{equation*}

which is equivalent to

\begin{equation*}
    \forall \vec x, \vec x',\vec z_1,\ldots,\vec z_n.  (P \land F_c(c_1) \rightarrow Q)
\end{equation*}

If we generate fresh variables for intermediate points, it suffices to prove the validity of the quantifier free formula:

\begin{equation*}
  ((P \land F_c(c_1)) \rightarrow Q)
\end{equation*}

Equivalently, we need to prove unsatisfiability of the formula

\begin{equation*}
   P \land F_c(c_1) \land \lnot Q
\end{equation*}

Further reading