Differences
This shows you the differences between two versions of the page.
sav08:forward_symbolic_execution [2009/03/11 10:47] vkuncak |
sav08:forward_symbolic_execution [2015/04/21 17:30] |
||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== Forward Symbolic Execution ====== | ||
- | Can be used for | ||
- | * verification condition generation | ||
- | * executing programs | ||
- | * testing, bug finding | ||
- | * proving programs correct (if extended with techniques for loop invariants) | ||
- | |||
- | ===== Basic Ideas ===== | ||
- | |||
- | Generalize an interpreter to work with symbolic instead of concrete values. | ||
- | |||
- | Assume the set of variables $V = \{x,y\}$. | ||
- | |||
- | Symbolic execution maintains a formula $R(x_0,y_0,x,y)$ describing the relation between the initial state and the current state. | ||
- | |||
- | === Exploiting Analogy with Execution === | ||
- | |||
- | Maintain symbolic state formula $R(x_0,x)$ in the following normal form: | ||
- | \[ | ||
- | x = e_x \land y = e_y \land P | ||
- | \] | ||
- | where $e_x$ is term describing symbolic state of $x$ and $e_y$ describes symbolic state of $y$. The remaining part of the formula $R$ is given in //path condition// $P$. It is a conjunction of all branch conditions up to current program point. We require $P$ to be expressed only in terms of $x_0,y_0$, not $x,y$. | ||
- | |||
- | === Checking Assertions === | ||
- | |||
- | Postcondition at current program point is $\exists x_0,y_0. R(x_0,y_0,x,y)$. | ||
- | |||
- | ++++What should it do when encountering assert(F)? | ||
- | |Generate proof obligation | ||
- | \[ | ||
- | \forall x,y.\ ((\exists x_0,y_0.R(x_0,y_0,x,y)) \rightarrow F(x,y)) | ||
- | \] | ||
- | that is, check the validity of $R \rightarrow F$. | ||
- | |||
- | Afterwards, we can assume(F). | ||
- | ++++ | ||
- | |||
- | === Branch Pruning === | ||
- | |||
- | If condition $P$ ever becomes contradictory, relation is empty and symbolic execution can stop (path is unreachable). | ||
- | |||
- | Special case: if initial condition specifies values of variables, then we know which branch to enter. | ||
- | |||
- | ===== Symbolic Execution Rules ===== | ||
- | |||
- | We derive these rules from semantics, taking into account the definition of R in terms of e_x,e_y,P. | ||
- | |||
- | === Assignment === | ||
- | |||
- | Given | ||
- | x = f(x,y) | ||
- | change state: | ||
- | e_x := f(e_x,e_y) | ||
- | |||
- | ++++How does this follow from relation composition?| | ||
- | \[ | ||
- | \{((x0,y0),(x,y)) \mid (x = e_x \land y = e_y \land P)\} \circ \{ ((x,y),(x',y') \mid x' = f(x,y) \land y' = y \} | ||
- | \] | ||
- | \[ | ||
- | \{((x0,y0),(x',y')) \mid \exists x,y. (x = e_x \land y = e_y \land P \land x' = f(x,y) \land y' = y \} | ||
- | \] | ||
- | \[ | ||
- | \{((x0,y0),(x',y')) \mid (x' = f(e_x,e_y) \land y' = e_y \land P \} | ||
- | \] | ||
- | \[ | ||
- | \{((x0,y0),(x,y)) \mid (x = f(e_x,e_y) \land y = e_y \land P \} | ||
- | \] | ||
- | ++++ | ||
- | |||
- | === Assume === | ||
- | |||
- | Given | ||
- | assume(F) | ||
- | change state: | ||
- | P := P & F | ||
- | |||
- | === Havoc === | ||
- | |||
- | Given | ||
- | havoc(x) | ||
- | change state: | ||
- | e_x := x_k | ||
- | where x_k is a fresh variable. | ||
- | |||
- | === Assert === | ||
- | |||
- | Given | ||
- | assert(F(x,y)) | ||
- | emit proof obligation: | ||
- | \[ | ||
- | (x = e_x \land y = e_y \land P) \rightarrow F(e_x,e_y) | ||
- | \] | ||
- | and change the state: | ||
- | x := x_k (fresh) | ||
- | y := y_k (fresh) | ||
- | P := F | ||
- | |||
- | === Sequential composition === | ||
- | |||
- | Handled implicitly: execute statements one by one. | ||
- | |||
- | === Non-deterministic choice === | ||
- | |||
- | Execute both branches and collect proof obligations from both branches. | ||
- | |||
- | Exponential explosion remedy: | ||
- | * work on control-flow graph | ||
- | * merge symbolic states when edges merge | ||
- | |||
- | **Merging:** Given two symbolic states: | ||
- | x = e_x1 & y = e_y1 & P1 | ||
- | x = e_x2 & y = e_y2 & P2 | ||
- | generate a single symbolic state | ||
- | x = x_k & y = y_k & P | ||
- | where P is | ||
- | ((x_k = e_x1 & y_k = e_y1 & P1) | | ||
- | ((x_k = e_x2 & y_k = e_y2 & P2) | ||
- | |||
- | Then we continue only from this one state. | ||
- | |||
- | ===== Execution ===== | ||
- | |||
- | If the first statements are | ||
- | assume (x=0) | ||
- | assume (x=1) | ||
- | Then symbolic execution knows the exact values and can work as an interpreter. | ||
- | |||
- | |||
- | ===== Underapproximations ===== | ||
- | |||
- | Here we generate real counterexamples (underapproximation), we compute smaller relation than one given by program semantics. | ||
- | |||
- | If there are no loop invariants, we can still execute program on concrete inputs. For symbolic initial state (e.g. giving upper bound of loop) execution need not terminate. | ||
- | * Bounding depth: stop executing for paths longer than some value | ||
- | * Bounding state: small integer width, small number of objects in heap | ||
- | * Loop unrolling: replace loop( c ) with c[]c;c[]c;c;c for some number of times | ||
- | * Instantiating values: find a satisfying assignment for R, take value v of x in it and replace e_x with v (moves towards concrete execution) | ||
- | |||
- | |||
- | ===== Overaproximation ===== | ||
- | |||
- | In this case we compute overaproximation, bigger relation than the actual relation given by program semantics. | ||
- | |||
- | Examples: | ||
- | * do not maintain exact expressions for variables, but only whether they are positive | ||
- | * maintain only linear or certain polynomial constraints - if actual constraints are more complex, keep only the polynomial constraints that are guaranteed to hold | ||
- | |||
- | Why we approximate we can obtain | ||
- | * termination even for programs that have loops | ||
- | * useful information about the program (not exactly all information that we are interested in, depends on the approximation) | ||
- | |||
- | ===== References ===== | ||
- | |||
- | * [[http://doi.acm.org/10.1145/360248.360252|Symbolic execution and program testing]] by James C. King | ||
- | * Lori Clarke and Debra Richardson: Symbolic Evaluation Methods for Program Analysis, in Program Flow Analysis: Theory and Applications, Prentice-Hall 1981. | ||
- | * [[http://www.cs.utexas.edu/users/sandip/publications/symbolic-lpar/main.html|Verification Condition Generation via Theorem Proving]] | ||
- | * [[http://osl.cs.uiuc.edu/~ksen/cute/|CUTE Tool]] |