LARA

Better Backward Analysis

Like Safety checking of machine code, we base ourselves on boolean formulas over (linear) inequalities. This is why we need more than polyhedra:

package examples;
 
import gov.nasa.jpf.rddsa.AbstrUtil;
 
class PolyBad {
    static class A {
	public int f;
	public int g;
    }
    public static void main(String[] args) {
	A x = new A();
	x.f = AbstrUtil.input();
	if (x.f >= 6 ) {
	    x.g = 2 * x.f;
	    x.g = x.g - 2;
	} else {
	    x.g = 3 * x.f;
	}
	AbstrUtil.Assert(x.g > 10);
    }
}

Using one polyhedron, we have to obtain the convex hull between the two branches of the if, thus losing all the information inside. If we allow for arbitrary boolean expressions, then we can stay path-dependent. Another example is Unsafe module.

The language of the formulas

$F$ ::= $F \wedge F | F \vee F | \neg F | F \Rightarrow F | B $
$B$ ::= $E < E | E \leq E | E = E | E \neq E | E \geq E | E > E $
$E$ ::= $E + E | E - E | E * E | E / E | c | x.f $

where $x.f$ is field $f$ of variable $x$.

The analysis lattice

We work with the lattice of boolean formulas, with $\bot$ being false and $\top$ being true. Join $\sqcup$ is $\vee$, meet $\sqcap$ is $\wedge$. Thus, $P \sqsubseteq Q$ iff $P \Rightarrow Q$.

The error condition after the instruction is always taken to be $Q$. For assert, we have to deal with the

Instruction Abstract Interpretation
$[P]$ $P \wedge Q$
$\{ P \}$ $P \Rightarrow Q$
$ x.f := E $ $Q[ x.f \rightarrow E]$
$ x.f := input() $ $ \exists ~ x.f ~ Q  $
$ x.f := * $ $ \exists ~ x.f ~ Q  $
$ x := new $ $ \exists x.f Q $ for every field $f$ of $x$
$ x := y $ $Q[ x.f \rightarrow y.f \forall f \in fields(y)]$
$ exit $ false

Induction-iteration with error conditions

Calculating error conditions for non-looping code is feasible with these rules. However, as soon as we encounter loops, our error condition might grow indefinitely. This is where the induction-iteration method is used. We adapt it to deal with error conditions as follows:

  • Let $E(0)$ be the error condition obtained from going through the loop once back to the loop head and define $E(i + 1)$ as going through the loop once more with the error condition $E(i)$.
  • The error condition of the loop is $\bigvee_{i \geq 0} E(i)$
  • Let $M(i) = \bigvee_{j \geq i \geq 0} E(i)$

If we find an $M(i)$ that is

  • False on entry of the loop
  • $E(j + 1) \Rightarrow M(j)$

Then we have found an error condition that holds through the loop.

Report

The current report can be found here: Constant propagation and backward analysis.

SVN repository: https://svn.epfl.ch/svn/rddsa

Examples

The examples can be found in the svn repository under the directory trunk/javapathfinder/extensions/rddsa/src/examples . They are written in Java and can be run with the script run_example.sh . The CFG in the intermediary language that we are treating can be obtained by using doc/make_graph.sh .

Status

  • I have all the infrastructure set up to obtain the intermediary language from Java bytecode and current state from JPF.
  • I have written examples that could benefit from analysis in the middle of program execution
  • I formulated the better backward analysis and tried it on some examples by hand
  • The implementation of this backward analysis is not complete yet

Introduction

The current execution state of a program contains a lot of useful information about future program execution. This information could be used to better determine absence of certain errors and garbage collection information. An iterative analysis, running in parallel with program execution could obtain this information. I show how to combine the results of an iterative forward and backward analysis with current program state and use that information to detect absence of errors.

  1. Create CFG
  2. Run expensive backward analysis for some time
  3. At some point, run the state-dependent forward analysis. From the two analyses, see whether we can discharge assertions.
  4. All the time in parallel, continue running program

Using a forward analysis with symbolic objects per instantiation point and a complex backward analysis, I want to assure that the program does not run into an error in the future.

Motivating example

class Example {
  static class A {
    int f;
    int g;
  }
 
  public static void main(String[] args) {
    Random r = new Random();
    A x = new A();
    x.f = Integer.valueOf(args[0]);
    x.g = r.nextInt();
    A y = new A();
 
    TM(x,y);
 
    while (x.f != 1) {
      if (x.f % 2 == 0) x.f = x.f / 2;
      else x.f := x.f * 3 + 1;
    }
 
    assert x.g == y.g;
  }
}

Where TM is a sufficiently complex procedure.

Incremental Analysis

Here is the current description of all the parts of the analysis tying in together: Constant propagation and backward analysis. There are, however, severe limitations to applicability.

  1. Define language of statement, transfer CFG
    • For loops, we need some sort of integer manipulation if we only have one-level heap, otherwise it gets boring
  2. Define forward constant propagation and its use to discharge assertions
    • At the moment, use the same abstraction of one store per instantiation point
    • Can we do this iteratively, giving better results each step?
  3. define backward analysis
  4. see how we can combine the two
    • Constant propagation results can be reused partially: If we have a concrete value for a field, it will really stay that way. Using this additional information, we can get a more precise error space.

Parallel, iterated execution

  1. Run one phase of backward analysis before program execution
  2. Update thread (to be run in parallel, separate core if possible):
    1. on fixed intervals, take the current program state and do a constant propagation
    2. if, during this execution we obtain an abstract state $\sqsubseteq$ old state (i.e. more constants present) and we are at an instruction of the form x.f := *, recalculate this instruction replacing fields whose values are known with their constant value.
    3. iterate the backward analysis until it converges again
    4. What happens if we are inside a loop?

Relevant

More, less related stuff:

Examples

References