Correctness of formula propagation

By definition of sp and our requirements of N, the propagation is correct for code without loops.

Using an example of a single loop, we explain that if the analysis terminates then it computes an inductive invariant weaker than the initial formula.

By induction, to prove that property holds after any number of loop iterations, we show that it holds initially and that it is preserved by each loop iteration.

Consider a loop

while (F) {c}

This is transformed into this control-flow graph:

When the analysis terminates, we have $\mbox{fact}_{i+1}(w) \rightarrow \mbox{fact}_i(w)$ . We therefore have

$\begin{eqnarray*} sp(\mbox{fact}_i(w), \mbox{assume}(F);c) \rightarrow \\ \mbox{fact}_i(w) \lor sp(\mbox{fact}_i(w), \mbox{assume}(F);c) \lor \ldots \rightarrow \\ N(\mbox{fact}_i(w) \lor sp(\mbox{fact}_i(w), \mbox{assume}(F);c) \lor \ldots) = \\ \mbox{fact}_{i+1}(w) \rightarrow \\ \mbox{fact}_i(w) \end{eqnarray*}$

Let $I$ denote $\mbox{fact}_i(w)$ . We have shown

$\begin{equation*} sp(I, \mbox{assume}(F);c)) \rightarrow I \end{equation*}$

That is, $I$ is preserved across every loop iteration. Note that $sp(I,\mbox{assume}(F);c)) \rightarrow I$ corresponds to

$\begin{equation*} I \rightarrow wp(I, \mbox{assume}(F);c) \end{equation*}$

that is,

$\begin{eqnarray*} I \rightarrow (F \rightarrow wp(I, c)) \\ I \land F \rightarrow wp(I,c) \end{eqnarray*}$

which we mentioned before.

Because formulas are increasingly weaker, we have

$\begin{equation*} \mbox{fact}_0(w) \rightarrow I \end{equation*}$

Because $I$ holds initially and is preserved, it holds in all states reachable from $\mbox{fact}_0(w)$ , so it is an inductive invariant.

By repeating this argument inductively starting from innermost loops, we explain the correctness of analysis of guarded command languages represented as control-flow graphs.

We can also represent an arbitrary control-flow graph as one while loop, by introducing a program counter variable denoting the node in the control flow graph. Let

E = {(u_1,c_1,v_1),...,(u_n,c_n,v_n)}

be the control flow graph where some u_i and v_i might be the same. It's meaning can be given using the following loop:

pc = u_entry;
while (pc != v_exit) {
  (assume(pc=u_1); c_1; pc = v_1) []
  (assume(pc=u_2); c_2; pc = v_2) []
               ...                []
  (assume(pc=u_n); c_n; pc = v_n)
}

With this transformation we can represent arbitrary control-flow graphs, and then we need to worry only about one loop. Moreover, we can support control-flow constructs such as goto and break.

Note: It does not matter how we obtained $\mbox{fact}_i(v)$ as long as, for the control flow graph $(V,E)$ , we can find some values $\mbox{fact}(v)$ for each node $v \in V$ , such that for each $(v,c,u) \in E$ we have

$\begin{equation*} sp(\mbox{fact}(u),c) \rightarrow \mbox{fact}(v) \end{equation*}$

In particular, we could have guessed a loop invariant using some heuristics, or postupated invariant as a formula with some unknowns and then tried to solve for these unknowns.