Differences

This shows you the differences between two versions of the page.

--- sav08:forward_symbolic_execution [2009/03/11 10:47]
vkuncak
+++ sav08:forward_symbolic_execution [2015/04/21 17:30]
@@ Line 1: / Line 1: @@
-====== Forward Symbolic Execution ======
-Can be used for
-  * verification condition generation
-  * executing programs
-  * testing, bug finding
-  * proving programs correct (if extended with techniques for loop invariants)
-===== Basic Ideas =====
-Generalize an interpreter to work with symbolic instead of concrete values.
-Assume the set of variables $V = \{x,y\}$.
-Symbolic execution maintains a formula $R(x_0,y_0,x,y)$ describing the relation between the initial state and the current state.
-=== Exploiting Analogy with Execution ===
-Maintain symbolic state formula $R(x_0,x)$ in the following normal form:
-\[
-    x = e_x \land y = e_y \land P
-\]
-where $e_x$ is term describing symbolic state of $x$ and $e_y$ describes symbolic state of $y$.  The remaining part of the formula $R$ is given in //path condition// $P$.  It is a conjunction of all branch conditions up to current program point.  We require $P$ to be expressed only in terms of $x_0,y_0$, not $x,y$.
-=== Checking Assertions ===
-Postcondition at current program point is $\exists x_0,y_0. R(x_0,y_0,x,y)$.
-++++What should it do when encountering assert(F)?
-|Generate proof obligation
-\[
-    \forall x,y.\ ((\exists x_0,y_0.R(x_0,y_0,x,y)) \rightarrow F(x,y))
-\]
-that is, check the validity of $R \rightarrow F$.
-Afterwards, we can assume(F).
-++++
-=== Branch Pruning ===
-If condition $P$ ever becomes contradictory, relation is empty and symbolic execution can stop (path is unreachable).
-Special case: if initial condition specifies values of variables, then we know which branch to enter.
-===== Symbolic Execution Rules =====
-We derive these rules from semantics, taking into account the definition of R in terms of e_x,e_y,P.
-=== Assignment ===
-Given
-  x = f(x,y)
-change state:
-  e_x := f(e_x,e_y)
-++++How does this follow from relation composition?|
-\[
-\{((x0,y0),(x,y)) \mid (x = e_x \land y = e_y \land P)\} \circ \{ ((x,y),(x',y') \mid x' = f(x,y) \land y' = y \}
-\]
-\[
-\{((x0,y0),(x',y')) \mid \exists x,y. (x = e_x \land y = e_y \land P \land x' = f(x,y) \land y' = y \}
-\]
-\[
-\{((x0,y0),(x',y')) \mid (x' = f(e_x,e_y) \land y' = e_y \land P \}
-\]
-\[
-\{((x0,y0),(x,y)) \mid (x = f(e_x,e_y) \land y = e_y \land P \}
-\]
-++++
-=== Assume ===
-Given
-  assume(F)
-change state:
-  P := P & F
-=== Havoc ===
-Given
-  havoc(x)
-change state:
-  e_x := x_k
-where x_k is a fresh variable.
-=== Assert ===
-Given
-  assert(F(x,y))
-emit proof obligation:
-\[
-   (x = e_x \land y = e_y \land P) \rightarrow F(e_x,e_y)
-\]
-and change the state:
-  x := x_k (fresh)
-  y := y_k (fresh)
-  P := F
-=== Sequential composition ===
-Handled implicitly: execute statements one by one.
-=== Non-deterministic choice ===
-Execute both branches and collect proof obligations from both branches.
-Exponential explosion remedy:
-  * work on control-flow graph
-  * merge symbolic states when edges merge
-**Merging:** Given two symbolic states:
-  x = e_x1 & y = e_y1 & P1
-  x = e_x2 & y = e_y2 & P2
-generate a single symbolic state
-  x = x_k & y = y_k & P
-where P is
- ((x_k = e_x1 & y_k = e_y1 & P1) |
- ((x_k = e_x2 & y_k = e_y2 & P2)
-Then we continue only from this one state.
-===== Execution =====
-If the first statements are
-  assume (x=0)
-  assume (x=1)
-Then symbolic execution knows the exact values and can work as an interpreter.
-===== Underapproximations =====
-Here we generate real counterexamples (underapproximation), we compute smaller relation than one given by program semantics.
-If there are no loop invariants, we can still execute program on concrete inputs.  For symbolic initial state (e.g. giving upper bound of loop) execution need not terminate.
-  * Bounding depth: stop executing for paths longer than some value
-  * Bounding state: small integer width, small number of objects in heap
-  * Loop unrolling: replace loop( c ) with c[]c;c[]c;c;c for some number of times
-  * Instantiating values: find a satisfying assignment for R, take value v of x in it and replace e_x with v (moves towards concrete execution)
-===== Overaproximation =====
-In this case we compute overaproximation, bigger relation than the actual relation given by program semantics.
-Examples:
-  * do not maintain exact expressions for variables, but only whether they are positive
-  * maintain only linear or certain polynomial constraints - if actual constraints are more complex, keep only the polynomial constraints that are guaranteed to hold
-Why we approximate we can obtain
-  * termination even for programs that have loops
-  * useful information about the program (not exactly all information that we are interested in, depends on the approximation)
-===== References =====
-  * [[http://doi.acm.org/10.1145/360248.360252|Symbolic execution and program testing]] by James C. King
-  * Lori Clarke and Debra Richardson: Symbolic Evaluation Methods for Program Analysis, in Program Flow Analysis: Theory and Applications, Prentice-Hall 1981.
-  * [[http://www.cs.utexas.edu/users/sandip/publications/symbolic-lpar/main.html|Verification Condition Generation via Theorem Proving]]
-  * [[http://osl.cs.uiuc.edu/~ksen/cute/|CUTE Tool]]