Differences
This shows you the differences between two versions of the page.
Both sides previous revision Previous revision Next revision | Previous revision Next revision Both sides next revision | ||
sav08:forward_symbolic_execution [2008/04/09 10:24] vkuncak |
sav08:forward_symbolic_execution [2009/03/11 12:44] philippe.suter |
||
---|---|---|---|
Line 2: | Line 2: | ||
Can be used for | Can be used for | ||
- | * verification condition generation (our main interest in this class) | + | * verification condition generation |
- | * testing | + | * executing programs |
- | * executing | + | * testing, bug finding |
+ | * proving programs correct (if extended with techniques for loop invariants) | ||
===== Basic Ideas ===== | ===== Basic Ideas ===== | ||
Generalize an interpreter to work with symbolic instead of concrete values. | Generalize an interpreter to work with symbolic instead of concrete values. | ||
+ | |||
+ | Assume the set of variables $V = \{x,y\}$. | ||
Symbolic execution maintains a formula $R(x_0,y_0,x,y)$ describing the relation between the initial state and the current state. | Symbolic execution maintains a formula $R(x_0,y_0,x,y)$ describing the relation between the initial state and the current state. | ||
Line 15: | Line 17: | ||
=== Exploiting Analogy with Execution === | === Exploiting Analogy with Execution === | ||
- | Maintain symbolic state formula $R(x_0,x)$ in the following normal form: | + | Maintain symbolic state formula $R(x_0,y_0,x,y)$ in the following normal form: |
\[ | \[ | ||
x = e_x \land y = e_y \land P | x = e_x \land y = e_y \land P | ||
\] | \] | ||
- | where $e_x$ is term describing symbolic state of $x$ and $e_y$ describes symbolic state of $y$. The remaining part of the formula $R$ is given in //path condition// $P$. It is a conjunction of all branch conditions up to current program point. We require $P$ to be expressed only in terms of $x_0$, not $x$. | + | where $e_x$ is term describing symbolic state of $x$ and $e_y$ describes symbolic state of $y$. The remaining part of the formula $R$ is given in //path condition// $P$. It is a conjunction of all branch conditions up to current program point. We require $P$ to be expressed only in terms of $x_0,y_0$, not $x,y$. |
=== Checking Assertions === | === Checking Assertions === | ||
Line 40: | Line 42: | ||
Special case: if initial condition specifies values of variables, then we know which branch to enter. | Special case: if initial condition specifies values of variables, then we know which branch to enter. | ||
+ | |||
===== Symbolic Execution Rules ===== | ===== Symbolic Execution Rules ===== | ||
Line 72: | Line 75: | ||
assume(F) | assume(F) | ||
change state: | change state: | ||
- | P := P & F | + | P := P & F[x<-e_x, y<-e_y] |
=== Havoc === | === Havoc === | ||
Line 118: | Line 121: | ||
Then we continue only from this one state. | Then we continue only from this one state. | ||
+ | ===== Execution ===== | ||
- | ===== Underapproximations ===== | + | If the first statements are |
+ | assume (x=0) | ||
+ | assume (x=1) | ||
+ | Then symbolic execution knows the exact values and can work as an interpreter. | ||
- | We next consider testing and debugging applications. | ||
- | These can generate real counterexamples (underapproximation) as opposed to providing proofs (overapproximations). | + | ===== Underapproximations ===== |
+ | |||
+ | Here we generate real counterexamples (underapproximation), we compute smaller relation than one given by program semantics. | ||
If there are no loop invariants, we can still execute program on concrete inputs. For symbolic initial state (e.g. giving upper bound of loop) execution need not terminate. | If there are no loop invariants, we can still execute program on concrete inputs. For symbolic initial state (e.g. giving upper bound of loop) execution need not terminate. | ||
- | * Bounding depth: stop executing for path | + | * Bounding depth: stop executing for paths longer than some value |
* Bounding state: small integer width, small number of objects in heap | * Bounding state: small integer width, small number of objects in heap | ||
- | * Loop unrolling: replace loop( c ) with c;c;c for some number of times | + | * Loop unrolling: replace loop( c ) with c[]c;c[]c;c;c for some number of times |
* Instantiating values: find a satisfying assignment for R, take value v of x in it and replace e_x with v (moves towards concrete execution) | * Instantiating values: find a satisfying assignment for R, take value v of x in it and replace e_x with v (moves towards concrete execution) | ||
+ | |||
+ | |||
+ | ===== Overaproximation ===== | ||
+ | |||
+ | In this case we compute overaproximation, bigger relation than the actual relation given by program semantics. | ||
+ | |||
+ | Examples: | ||
+ | * do not maintain exact expressions for variables, but only whether they are positive | ||
+ | * maintain only linear or certain polynomial constraints - if actual constraints are more complex, keep only the polynomial constraints that are guaranteed to hold | ||
+ | |||
+ | Why we approximate we can obtain | ||
+ | * termination even for programs that have loops | ||
+ | * useful information about the program (not exactly all information that we are interested in, depends on the approximation) | ||
===== References ===== | ===== References ===== |