Differences
This shows you the differences between two versions of the page.
Both sides previous revision Previous revision Next revision | Previous revision | ||
sav07_lecture_3_skeleton [2007/03/20 14:44] vkuncak |
sav07_lecture_3_skeleton [2007/03/21 14:27] (current) vkuncak |
||
---|---|---|---|
Line 1: | Line 1: | ||
====== Lecture 3 (Skeleton) ====== | ====== Lecture 3 (Skeleton) ====== | ||
+ | |||
+ | Summary of what we are doing in today's class: | ||
+ | |||
+ | {{vcg-big-picture.png}} | ||
+ | |||
+ | |||
+ | ===== Verification condition generation: converting programs into formulas ===== | ||
==== Context ==== | ==== Context ==== | ||
Line 6: | Line 13: | ||
* represent programs using guarded command language, e.g. desugaring of 'if' into non-deterministic choice and assume | * represent programs using guarded command language, e.g. desugaring of 'if' into non-deterministic choice and assume | ||
* give meaning to guarded command language statements as relations | * give meaning to guarded command language statements as relations | ||
- | * we can represent relations using set comprehensions; if our program r has two state components, we can represent its meaning R(r) as | + | * we can represent relations using set comprehensions; if our program c has two state components, we can represent its meaning R( c ) as $\{((x_0,y_0),(x,y)) \mid F \}$, where F is some formula that has x,y,x_0,y_0 as free variables. |
- | <latex> | + | |
- | \{((x_0,y_0),(x,y)) \mid F \} | + | * simple values: variables are integers. Later we will talk about modeling pointers and arrays, but what we say now applies |
- | </latex> | + | |
- | where F is some formula that has x,y,x_0,y_0 as free variables. | + | |
- | Our goal is to find rules for computing R(r) that are | + | Our goal is to find rules for computing R( c ) that are |
* correct | * correct | ||
* efficient | * efficient | ||
- | * create formulas that we can prove later | + | * create formulas that we can effectively prove later |
+ | |||
+ | |||
+ | What exactly do we prove about the formula R( c ) ? | ||
+ | |||
+ | We prove that this formula is **valid**: | ||
+ | |||
+ | R( c ) -> error=false | ||
Line 29: | Line 42: | ||
R(havoc x) = frame(x) | R(havoc x) = frame(x) | ||
- | R(assume F) = F[x:=x_0, y:=y_0, error:=error_0] | + | R(assume F) = F[x:=x_0, y:=y_0, error:=error_0] & frame() |
R(assert F) = (F -> frame) | R(assert F) = (F -> frame) | ||
Line 59: | Line 72: | ||
when c is a basic command. | when c is a basic command. | ||
+ | |||
+ | |||
+ | |||
Line 74: | Line 90: | ||
But if a variable is equal to another, it can be substituted using the substitution rules | But if a variable is equal to another, it can be substituted using the substitution rules | ||
- | (exists x_1. x_1 = t & F(x_1)) <-> F(t) | + | (exists x_1. x_1=t & F(x_1)) <-> F(t) |
- | (forall x_1. x_1 = t -> F(x_1) <-> F(t) | + | (forall x_1. x_1=t -> F(x_1) <-> F(t) |
We can apply these rules to reduce the size of formulas. | We can apply these rules to reduce the size of formulas. | ||
- | ==== Papers ==== | + | |
+ | ==== Approximation ==== | ||
+ | |||
+ | If (F -> G) is value, we say that F is stronger than F and we say G is weaker than F. | ||
+ | |||
+ | When a formula would be too complicated, we can instead create a simpler approximate formula. To be sound, if our goal is to prove a property, we need to generate a *larger* relation, which corresponds to a weaker formula describing a relation, and a stronger verification condition. (If we were trying to identify counterexamples, we would do the opposite). | ||
+ | |||
+ | We can replace "assume F" with "assume F1" where F1 is weaker. Consequences: | ||
+ | * omtiting complex if conditionals (assuming both branches can happen - as in most type systems) | ||
+ | * replacing complex assignments with arbitrary change to variable: because x=t is havoc(x);assume(x=t) and we drop the assume | ||
+ | |||
+ | This idea is important in static analysis. | ||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
+ | ==== Symbolic execution ==== | ||
+ | |||
+ | Symbolic execution converts programs into formulas by going forward. It is therefore somewhat analogous to the way an [[interpreter]] for the language would work. | ||
+ | |||
+ | Avoid renaming all the time. | ||
+ | |||
+ | SE(F,k, c1; c2) = SE(F & R(c1), k+1, c2) (update formula) | ||
+ | |||
+ | SE(F,k,(c1 [] c2); c2) = SE(F, k, c1) | SE(F,k,c2) (explore both branches) | ||
+ | |||
+ | Note: how many branches do we get? | ||
+ | |||
+ | Strongest postcondition: | ||
+ | \begin{equation*} | ||
+ | sp(P,r) = \{ s_2 \mid \exists s_1.\ s_1 \in P \land (s_1,s_2) \in r \} | ||
+ | \end{equation*} | ||
+ | Like composition of a set with a relation. It's called ''relational image'' of set $P$ under relation $r$. | ||
+ | |||
+ | Note: when proving our verification condition, instead of proving that semantics of relation implies error=false, it's same as proving that the formula for set sp(U,r) implies error=false, where U is the universal relation, or, in terms of formulas, computing the strongest postcondition of formula 'true'. | ||
+ | |||
+ | ==== Weakest preconditions ==== | ||
+ | |||
+ | While symbolic execution computes formula by going forward along the program syntax tree, weakest precondition computes formula by going backward. | ||
+ | |||
+ | wp(Q, x=t) = | ||
+ | wp(Q, assume F) = | ||
+ | wp(Q, assert F) = | ||
+ | wp(Q, c1 [] c2) = | ||
+ | wp(Q, c1 ; c2) = | ||
+ | |||
+ | ==== Inferring Loop Invariants ==== | ||
+ | |||
+ | Suppose we compute strongest postcondition in a program where we unroll loop k times. | ||
+ | * What does it denote? | ||
+ | * What is its relationship to loop invariant? | ||
+ | |||
+ | Weakening strategies | ||
+ | * maintain a conjunction | ||
+ | * drop conjuncts that do not remain true | ||
+ | |||
+ | Alternative: | ||
+ | * decide that you will only loop for formulas of restricted form, as in abstract interpretation and data flow analysis (next week) | ||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
+ | ===== One useful decision procedure: Proving quantifier-free linear arithmetic formulas ===== | ||
+ | |||
+ | Suppose that we obtain (one or more) verification conditions of the form | ||
+ | \begin{equation*} | ||
+ | F\ \rightarrow\ (\mbox{error}=\mbox{false}) | ||
+ | \end{equation*} | ||
+ | |||
+ | whose validity we need to prove. We here assume that F contains only linear arithmetic. Note: we can check satisfiability of $F\ \land\ (\mbox{error}=\mbox{true})$. We show an algorithm to check this satisfiability. | ||
+ | |||
+ | ==== Quantifier Presburger arithmetic ==== | ||
+ | |||
+ | Here is the grammar: | ||
+ | |||
+ | var = x | y | z | ... (variables) | ||
+ | K = ... | -2 | -1 | 0 | 1 | 2 | ... (integer constants) | ||
+ | T ::= var | T + T | K * T (terms) | ||
+ | A ::= T=T | T <= T (atomic formulas) | ||
+ | F ::= A | F & F | F|F | ~F (formulas) | ||
+ | |||
+ | To get full Presburger arithmetic, allow existential and universal quantifiers in formula as well. | ||
+ | |||
+ | Note: we can assume we have boolean variables (such as 'error') as well, because we can represent them as 0/1 integers. | ||
+ | |||
+ | Satisfiability of quantifier-free Presburger arithmetic is decidable. | ||
+ | |||
+ | Proof: small model theorem. | ||
+ | |||
+ | ==== Small model theorem for Quantifier-Free Presburger Arithmetic (QFPA) ==== | ||
+ | |||
+ | First step: transform to disjunctive normal form. | ||
+ | |||
+ | Next: reduce to integer linear programming: | ||
+ | \begin{equation*} | ||
+ | A\vec x = \vec b, \qquad \vec x \geq \vec 0 | ||
+ | \end{equation*} | ||
+ | where $A \in {\cal Z}^{m,n}$ and $x \in {\cal Z}^n$. | ||
+ | |||
+ | Then solve integer linear programming (ILP) problem | ||
+ | * [[wk>Integer Linear Programming]] | ||
+ | * online book chapter on ILP | ||
+ | * [[http://www.gnu.org/software/glpk/|GLPK]] tool | ||
+ | |||
+ | We can prove small model theorem for ILP - gives bound on search. | ||
+ | |||
+ | Short proof by {{papadimitriou81complexityintegerprogramming.pdf|Papadimitriou}}: | ||
+ | * solution of Ax=b (A regular) has as components rationals of form p/q with bounded p,q | ||
+ | * duality of linear programming | ||
+ | * obtains bound $M = n(ma)^{2m+1}$, which needs $\log n + (2m+1)\log(ma)$ bits | ||
+ | * we could encode the problem into SAT: use circuits for addition, comparison etc. | ||
+ | |||
+ | Note: if small model theorem applies to conjunctions, it also applies to arbitrary QFPA formulas. | ||
+ | |||
+ | Moreover, one can improve these bounds. One tool based on these ideas is [[http://www.cs.cmu.edu/~uclid/|UCLID]]. | ||
+ | |||
+ | Alternative: enumerate disjuncts of DNF on demand, each disjunct is a conjunction, then use ILP techniques (often first solve the underlying linear programming problem over reals). Many SMT tools are based on this idea (along with Nelson-Oppen combination: next class). | ||
+ | * [[http://www.cs.nyu.edu/acsys/cvc3/download.html|CVC3]] (successor of CVC Lite) | ||
+ | * [[http://combination.cs.uiowa.edu/smtlib/|SMT-LIB]] Standard for formulas, competition | ||
+ | |||
+ | |||
+ | ==== Full Presburger arithmetic ==== | ||
+ | |||
+ | Full Presburger arithmetic is also decidable. | ||
+ | |||
+ | Approaches: | ||
+ | * Quantifier-Elimination (Omega tool from Maryland) - see homework | ||
+ | * Automata Theoretic approaches: LASH, MONA (as a special case) | ||
+ | |||
+ | ===== Papers ===== | ||
* Verification condition generation in Spec#: http://research.microsoft.com/~leino/papers/krml157.pdf | * Verification condition generation in Spec#: http://research.microsoft.com/~leino/papers/krml157.pdf | ||
Line 87: | Line 237: | ||
* Presburger Arithmetic (PA) bounds: {{papadimitriou81complexityintegerprogramming.pdf}} | * Presburger Arithmetic (PA) bounds: {{papadimitriou81complexityintegerprogramming.pdf}} | ||
* Specializing PA bounds: http://www.lmcs-online.org/ojs/viewarticle.php?id=43&layout=abstract | * Specializing PA bounds: http://www.lmcs-online.org/ojs/viewarticle.php?id=43&layout=abstract | ||
- |