Differences

This shows you the differences between two versions of the page.

--- sav07_lecture_3_skeleton [2007/03/20 14:32]
vkuncak
+++ sav07_lecture_3_skeleton [2007/03/21 09:25]
vkuncak
@@ Line 1: / Line 1: @@
 ====== Lecture 3 (Skeleton) ======
+===== Converting programs (with simple values) to formulas =====
 ==== Context ====
@@ Line 6: / Line 9: @@
   * represent programs using guarded command language, e.g. desugaring of 'if' into non-deterministic choice and assume
   * give meaning to guarded command language statements as relations
-  * we can represent relations using set comprehensions; if our program r has two state components, we can represent its meaning R(r) as
+  * we can represent relations using set comprehensions; if our program c has two state components, we can represent its meaning R( c ) as $\{((x_0,y_0),(x,y)) \mid F  \}$, where F is some formula that has x,y,x_0,y_0 as free variables.
-<latex>
-\{((x_0,y_0),(x,y)) \mid F \}
-</latex>
-    where F is some formula that has x,y,x_0,y_0 as free variables.
-Our goal is to find rules for computing R(r) that are
+  * this is what I mean by ''simple values'': later we will talk about modeling pointers and arrays, but we will still use this as a starting point.
+Our goal is to find rules for computing R( c ) that are
   * correct
   * efficient
-  * create formulas that we can prove later
+  * create formulas that we can effectively prove later
+What exactly do we prove about the formula R( c ) ?
+We prove that this formula is **valid**:
+  R( c ) -> error=false
@@ Line 22: / Line 30: @@
 In our simple language, basic statements are assignment, havoc, assume, assert.
-R(x=t) = (x=t & y=y_0 & error=error_0)
+  R(x=t) = (x=t & y=y_0 & error=error_0)
 **Note**: all our statements will have the property that if error_0 = true, then error=true.  That is, you can never recover from an error state.  This is convenient: if we prove no errors at the end, then there were never errors in between.
@@ Line 28: / Line 36: @@
 **Note**: the condition y=y_0 & error=error_0 is called the **frame condition**.  There are as many conjuncts as there are components of the state.  This can be annoying to write, so let us use shorthand frame(x) for it.  The shorthand frame(x) denotes a conjunction of v=v_0 for all v that are distinct from x (in this case y and error).  We can have zero or more variables as arguments of frame, so frame() means that nothing changes.
-R(havoc x) = frame(x)
+  R(havoc x) = frame(x)
-R(assume F) = F[x:=x_0, y:=y_0, error:=error_0]
+  R(assume F) = F[x:=x_0, y:=y_0, error:=error_0] & frame()
-R(assert F) = (F -> frame)
+  R(assert F) = (F -> frame)
 **Note**:
-x=t  is same as  havoc(x);assume(x=t)
+  x=t  is same as  havoc(x);assume(x=t)
+  assert false = crash  (stops with error)
+  assume true  = skip   (does nothing)
-assert false = crash  (stops with error)
-assume true  = skip   (does nothing)
 ==== Composing formulas using relation composition ====
-This is perhaps the most direct way of transforming programs to formulas.
+This is perhaps the most direct way of transforming programs to formulas.  It creates formulas that are linear in the size of the program.
-It creates formulas that are linear in the size of the program.
 Non-deterministic choice is union of relations, that is, disjunction of formulas:
-CR(c1; c2) = CR(c1) | CR(c2)
+  CR(c1 [] c2) = CR(c1) | CR(c2)
-==== Papers ====
+In sequential composition we follow the rule for composition of relations.  We again want a formula with free variables x_0,y_0,x,y, so we need to rename the intermediate state.  Let x_1,y_1,error_1 be fresh variables.
+  CR(c1 ; c2) = exists x_1,y_1,error_1.  CR(c1)[x:=x_1,y:=y_1,error:=error_1] & CR(c2)[x_0:=x_1,y_0:=y_1,error_0:=error_1]
+The base case is
+  CR(c)=R(c)
+when c is a basic command.
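One way to sanity-check the rules above is to interpret them over a tiny finite state space, where a relation is simply a set of (initial, final) state pairs. The sketch below is illustrative only: the domain VALUES, the state layout (x, y, error), and all function names are assumptions, not part of the lecture. Assume is modeled as leaving the state unchanged, which is what the identity x=t = havoc(x);assume(x=t) needs. The intermediate state in seq plays the role of the fresh variables x_1,y_1,error_1.

```python
from itertools import product

# Assumption: a tiny domain so that relations can be enumerated explicitly.
VALUES = range(4)
# A state is a triple (x, y, error).
STATES = [(x, y, err) for x, y, err in product(VALUES, VALUES, (False, True))]


def assign_x(t):
    """R(x=t): x receives t evaluated in the initial state; y and error are framed."""
    return {(s, (t(s), s[1], s[2])) for s in STATES}


def havoc_x():
    """R(havoc x): x may take any value; y and error are framed."""
    return {(s, (v, s[1], s[2])) for s in STATES for v in VALUES}


def assume(cond):
    """R(assume F): keep only states satisfying F, and change nothing."""
    return {(s, s) for s in STATES if cond(s)}


def choice(r1, r2):
    """CR(c1 [] c2): union of relations, i.e. disjunction of formulas."""
    return r1 | r2


def seq(r1, r2):
    """CR(c1 ; c2): relation composition through an intermediate state."""
    return {(s0, s2) for (s0, s1) in r1 for (t1, s2) in r2 if s1 == t1}


# x = y+1 (mod 4) versus havoc(x); assume(x = y+1 (mod 4)).
# The term must not mention x itself for the two to coincide.
direct = assign_x(lambda s: (s[1] + 1) % 4)
desugared = seq(havoc_x(), assume(lambda s: s[0] == (s[1] + 1) % 4))
skip = assume(lambda s: True)
```

With these definitions, direct and desugared are the same set of pairs, and composing skip with any relation returns that relation unchanged.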
+==== Avoiding accumulation of equalities ====
+This approach generates many variables and many frame conditions.
+Ignoring error for the moment, we have, for example:
+  R(x=3) = (x=3 & y=y_0)
+  R(y=x+2) = (y=x_0 + 2 & x=x_0)
+  CR(x=3;y=x+2) = x_1=3 & y_1 = y_0 & y = x_1 + 2 & x = x_1
+But if a variable is equal to another, it can be substituted using the substitution rules
+  (exists x_1. x_1=t & F(x_1))     <->    F(t)
+  (forall x_1. x_1=t -> F(x_1))    <->    F(t)
+We can apply these rules to reduce the size of formulas.
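As a worked illustration (a sketch, with the existential quantifiers over the intermediate variables written out explicitly), the existential substitution rule can be applied twice to the composed formula for x=3;y=x+2:

```latex
\begin{align*}
&\exists x_1, y_1.\; x_1 = 3 \land y_1 = y_0 \land y = x_1 + 2 \land x = x_1 \\
&\iff \exists x_1.\; x_1 = 3 \land y = x_1 + 2 \land x = x_1
   && \text{($y_1$ occurs only in $y_1 = y_0$)} \\
&\iff y = 3 + 2 \land x = 3
   && \text{(substitute $x_1 := 3$)} \\
&\iff x = 3 \land y = 5
\end{align*}
```

The result mentions neither fresh variables nor frame conditions, and it no longer depends on x_0 or y_0.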
+==== Approximation ====
+If (F -> G) is valid, we say that F is stronger than G and that G is weaker than F.
+When a formula would be too complicated, we can instead create a simpler approximate formula.  To be sound, if our goal is to prove a property, we need to generate a **larger** relation, which corresponds to a weaker formula describing a relation, and a stronger verification condition.  (If we were trying to identify counterexamples, we would do the opposite.)
+We can replace "assume F" with "assume F1" where F1 is weaker.  Consequences:
+  * omitting complex if conditionals (assuming both branches can happen - as in most type systems)
+  * replacing complex assignments with arbitrary change to variable: because x=t is havoc(x);assume(x=t) and we drop the assume
+This idea is important in static analysis.
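The effect of dropping the assume can be seen on a small finite model. The following is a minimal sketch, assuming states are pairs (x, y) over a four-element domain; the names VALUES, assign_x, havoc_x and holds are hypothetical helpers, not part of the lecture. It shows that the approximated relation contains the precise one, that a property proved for the approximation carries over to the precise program, and that the approximation can also lose a property that the precise program does satisfy.

```python
from itertools import product

# Assumption: a tiny domain; a state is a pair (x, y), error is ignored here.
VALUES = range(4)
STATES = list(product(VALUES, VALUES))


def assign_x(t):
    """Precise meaning of x=t: x receives t evaluated in the initial state."""
    return {(s, (t(s), s[1])) for s in STATES}


def havoc_x():
    """Approximation: x changes arbitrarily, y is unchanged."""
    return {(s, (v, s[1])) for s in STATES for v in VALUES}


def holds(rel, prop):
    """True when prop(initial, final) holds for every pair in the relation."""
    return all(prop(s0, s1) for (s0, s1) in rel)


# Stand-in for a "complex" assignment: x = x*x + 1 (mod 4).
precise = assign_x(lambda s: (s[0] * s[0] + 1) % 4)
approx = havoc_x()


def y_unchanged(s0, s1):
    return s1[1] == s0[1]


def x_is_1_or_2(s0, s1):
    return s1[0] in (1, 2)


sound = precise <= approx                        # larger relation
kept = holds(approx, y_unchanged)                # provable from the approximation
lost = holds(precise, x_is_1_or_2) and not holds(approx, x_is_1_or_2)
```

Here sound, kept and lost all evaluate to True: the approximation is safe for proving properties, but it is not complete.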
+==== Symbolic execution ====
+Symbolic execution converts programs into formulas by going forward.  It is therefore somewhat analogous to the way an [[interpreter]] for the language would work.  It is based on the notion of strongest postcondition.
+==== Weakest preconditions ====
+While symbolic execution computes a formula by going forward along the program syntax tree, the weakest precondition approach computes a formula by going backward.
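The backward direction can be sketched as a recursive function over the command syntax. The rules used below are the standard textbook weakest precondition rules for guarded commands; they are not stated in these notes, so treat them as an assumption. Predicates are modeled as Python functions from a state dictionary to a boolean, and the tuple encoding of commands is hypothetical.

```python
def wp(cmd, post):
    """Return the weakest precondition of cmd with respect to post.

    Commands are tuples (an illustrative encoding):
      ("assign", var, term), ("assume", cond), ("assert", cond),
      ("choice", c1, c2), ("seq", c1, c2)
    """
    kind = cmd[0]
    if kind == "assign":
        _, var, term = cmd
        # wp(x=t, Q) = Q[x:=t]: evaluate Q in the updated state.
        return lambda s: post({**s, var: term(s)})
    if kind == "assume":
        _, cond = cmd
        # wp(assume F, Q) = F -> Q
        return lambda s: (not cond(s)) or post(s)
    if kind == "assert":
        _, cond = cmd
        # wp(assert F, Q) = F & Q
        return lambda s: cond(s) and post(s)
    if kind == "choice":
        _, c1, c2 = cmd
        # wp(c1 [] c2, Q) = wp(c1, Q) & wp(c2, Q)
        p1, p2 = wp(c1, post), wp(c2, post)
        return lambda s: p1(s) and p2(s)
    if kind == "seq":
        _, c1, c2 = cmd
        # wp(c1 ; c2, Q) = wp(c1, wp(c2, Q)): go backward from the end.
        return wp(c1, wp(c2, post))
    raise ValueError(f"unknown command kind: {kind}")


def true(s):
    return True


# y = x+2; assert y == 5
tail = ("seq",
        ("assign", "y", lambda s: s["x"] + 2),
        ("assert", lambda s: s["y"] == 5))
# x = 3; y = x+2; assert y == 5
prog = ("seq", ("assign", "x", lambda s: 3), tail)

pre_prog = wp(prog, true)
pre_tail = wp(tail, true)
```

For the full program, pre_prog holds in every initial state, so the assertion cannot fail. For tail alone, pre_tail holds exactly in the states where x is 3.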
+==== Inferring Loop Invariants ====
+Suppose we compute the strongest postcondition in a program where we unroll the loop k times.
+  * What does it denote?
+  * What is its relationship to loop invariant?
+Weakening strategies
+  * maintain a conjunction
+  * drop conjuncts that do not remain true
+Alternative:
+  * decide that you will only look for formulas of restricted form, as in abstract interpretation and data flow analysis (next week)
+===== Proving quantifier-free linear arithmetic formulas =====
+===== Papers =====
   * Verification condition generation in Spec#: http://research.microsoft.com/~leino/papers/krml157.pdf
@@ Line 57: / Line 138: @@
   * Presburger Arithmetic (PA) bounds: {{papadimitriou81complexityintegerprogramming.pdf}}
   * Specializing PA bounds: http://www.lmcs-online.org/ojs/viewarticle.php?id=43&layout=abstract