This is an old revision of the document!
Lecture 3 (Skeleton)
Recall what we are doing:
CR(c2)
In sequential composition we follow the rule for composition of relations. We want to get again formula with free variables x_0,y_0,x,y. So we need to do renaming. Let x_1,y_1,error_1 be fresh variables.
CR(c1 ; c2) = exists x_1,y_1,error_1. CR(c1)[x:=x_1,y:=y_1,error:=error_1] & CR(c2)[x:=x_1,y:=y_1,error:=error_1]
The base case is
CR(c)=R(c)
when c is a basic command.
==== Avoiding accumulation of equalities ====
This approach generates many variables and many frame conditions.
Ignoring error for the moment, we have, for example:
R(x=3) = (x=3 & y=y_0)
R(y=x+2) = (y=x_0 + 2 & x=x_0)
CR(x=3;y=x+2) = x_1=3 & y_1 = y_0 & y = x_1 + 2 & x = x_1
But if a variable is equal to another, it can be substituted using the substitution rules
(exists x_1. x_1=t & F(x_1)) <-> F(t)
(forall x_1. x_1=t -> F(x_1) <-> F(t)
We can apply these rules to reduce the size of formulas.
==== Approximation ====
If (F -> G) is value, we say that F is stronger than F and we say G is weaker than F.
When a formula would be too complicated, we can instead create a simpler approximate formula. To be sound, if our goal is to prove a property, we need to generate a *larger* relation, which corresponds to a weaker formula describing a relation, and a stronger verification condition. (If we were trying to identify counterexamples, we would do the opposite).
We can replace "assume F" with "assume F1" where F1 is weaker. Consequences:
* omtiting complex if conditionals (assuming both branches can happen - as in most type systems)
* replacing complex assignments with arbitrary change to variable: because x=t is havoc(x);assume(x=t) and we drop the assume
This idea is important in static analysis.
==== Symbolic execution ====
Symbolic execution converts programs into formulas by going forward. It is therefore somewhat analogous to the way an [[interpreter]] for the language would work.
Avoid renaming all the time.
SE(F,k, c1; c2) = SE(F & R(c1), k+1, c2) (update formula)
SE(F,k,(c1 [] c2); c2) = SE(F, k, c1) | SE(F,k,c2) (explore both branches)
Note: how many branches do we get?
Strongest postcondition:
\begin{equation*}
sp(P,r) = \{ s_2 \mid \exists s_1.\ s_1 \in P \land (s_1,s_2) \in r \}
\end{equation*}
Like composition of a set with a relation. It's called ''relational image'' of set $P$ under relation $r$.
Note: when proving our verification condition, instead of proving that semantics of relation implies error=false, it's same as proving that the formula for set sp(U,r) implies error=false, where U is the universal relation, or, in terms of formulas, computing the strongest postcondition of formula 'true'.
==== Weakest preconditions ====
While symbolic execution computes formula by going forward along the program syntax tree, weakest precondition computes formula by going backward.
wp(Q, x=t) =
wp(Q, assume F) =
wp(Q, assert F) =
wp(Q, c1 [] c2) =
wp(Q, c1 ; c2) =
==== Inferring Loop Invariants ====
Suppose we compute strongest postcondition in a program where we unroll loop k times.
* What does it denote?
* What is its relationship to loop invariant?
Weakening strategies
* maintain a conjunction
* drop conjuncts that do not remain true
Alternative:
* decide that you will only loop for formulas of restricted form, as in abstract interpretation and data flow analysis (next week)
===== Proving quantifier-free linear arithmetic formulas =====
Suppose that we obtain (one or more) verification conditions of the form
\begin{equation*}
F\ \rightarrow\ (\mbox{error}=\mbox{false})
\end{equation*}
whose validity we need to prove. We here assume that F contains only linear arithmetic. Note: we can check satisfiability of $F\ \land\ (\mbox{error}=\mbox{true})$. We show an algorithm to check this satisfiability.
==== Quantifier Presburger arithmetic ====
Here is the grammar:
var = x | y | z | ... (variables)
K = ... | -2 | -1 | 0 | 1 | 2 | ... (integer constants)
T ::= var | T + T | K * T (terms)
A ::= T=T | T <= T (atomic formulas)
F ::= A | F & F | F|F | ~F (formulas)
To get full Presburger arithmetic, allow existential and universal quantifiers in formula as well.
Note: we can assume we have boolean variables (such as 'error') as well, because we can represent them as 0/1 integers.
Satisfiability of quantifier-free Presburger arithmetic is decidable.
Proof: small model theorem.
==== Small model theorem for Quantifier-Free Presburger Arithmetic (QFPA) ====
First step: transform to disjunctive normal form.
Next: reduce to integer linear programming:
\begin{equation*}
A\vec x = \vec b, \qquad \vec x \geq \vec 0
\end{equation*}
where $A \in {\cal Z}^{m,n}$ and $x \in {\cal Z}^n$.
Then solve integer linear programming (ILP) problem
* [[wk>Integer Linear Programming]]
* online book chapter on ILP
* [[http://www.gnu.org/software/glpk/|GLPK]] tool
We can prove small model theorem for ILP - gives bound on search.
Short proof by {{papadimitriou81complexityintegerprogramming.pdf|Papadimitriou:
solution of Ax=b (A regular) has as components rationals of form p/q with bounded p,q
duality of linear programming
obtains bound
, which needs
bits
we could encode the problem into SAT: use circuits for addition, comparison etc.
Note: if small model theorem applies to conjunctions, it also applies to arbitrary QFPA formulas.
Moreover, one can improve these bounds. One tool based on these ideas is UCLID.
Alternative: enumerate disjuncts of DNF on demand, each disjunct is a conjunction, then use ILP techniques (often first solve the underlying linear programming problem over reals). Many SMT tools are based on this idea (along with Nelson-Oppen combination: next class).
CVC3 (successor of CVC Lite)
SMT-LIB Standard for formulas, competition
Full Presburger arithmetic
Full Presburger arithmetic is also decidable.
Approaches:
Quantifier-Elimination (Omega tool from Maryland) - see homework
Automata Theoretic approaches: LASH, MONA (as a special case)
Papers
-
Loop invariant inference for set algebra formulas:
hob-tcs.pdf
-