LARA

Variable Range Analysis for Example Program

The general form of abstract interpretation of the collecting semantics is analogous to collecting semantics, but replaces operations on sets with operations on the lattice:

\begin{equation*}
   F^\# : (V \to A) \to (V \to A)
\end{equation*}

\begin{equation*}
   F(g^\#)(v') = g^\#_{init}(v') \sqcup \bigsqcup_{(v,v') \in E} sp^\#(g^\#(v),r(v,v'))
\end{equation*}

Here $g^#_{init}(v')$ will be $\bot$ except at the entry into our control-flow graph, where it approximates the set of initial states at the entry point.

Abstract Analysis Domain

In Collecting Semantics for Example Program we had representation for all possible sets of states:

\begin{equation*}
   C = 2^\mathbb{Z}
\end{equation*}

Here we have representation of only certain states, namely intervals:

\begin{equation*}
   A =
\begin{array}[t]{l}
  \{ \bot \}\ \cup \\
  \{(-\infty,q] \mid q \in \mathbb{Z} \} \cup \\
  \{[p,+\infty) \mid p \in \mathbb{Z} \} \cup \\
  \{[p,q] \mid p \le q \} \cup \\
  \{ \top \}
\end{equation*}

The meaning of domain elements is given by a monotonic concretization function $\gamma : A \to C$:

\begin{equation*}
\begin{array}[t]{rcl}
  \gamma(\bot) &=& \emptyset \\
  \gamma(\{(-\infty,q]) &=& \{ x \mid x \le q\} \\
  \gamma(\{[p,+\infty)) &=& \{x \mid p \le x \} \\
  \gamma(\{[p,q]) &=& \{x \mid p \le x \land x \le q \} \\
  \gamma(\top) &=& \mathbb{Z}
\end{array}
\end{equation*}

From monotonicity and $a_1 \sqsubseteq a_1 \sqcup a_2$ it follows

\begin{equation*}
   \gamma(a_1)\ \subseteq\ \gamma(a_1 \sqcup a_2)
\end{equation*}

and thus

\begin{equation*}
   \gamma(a_1) \cup \gamma(a_2)\ \subseteq\ \gamma(a_1 \sqcup a_2)
\end{equation*}

We try to define $\gamma$ to be as small as possible while satisfying this condition.

Define abstraction function $\alpha : C \to A$ such that

  • $\alpha(s) = [\min s, \max s]$ if those values exist (set is bounded from below and above)
  • $\alpha(s) = [\min s, +\infty)$ if there is lower but no upper bound
  • $\alpha(s) = (-\infty, \max s]$ if there is upper but no lower bound
  • $\alpha(s) = \top$ if there is no upper and no lower bound
  • $\alpha(\emptyset) = \bot$

Lemma: The pair $(\alpha,\gamma)$ form a Galois Connection.

By property of Galois Connection, the condition $\gamma(a_1) \cup \gamma(a_2)\ \subseteq\ \gamma(a_1 \sqcup a_2)$ is equivalent to

\begin{equation*}
   \alpha(\gamma(a_1) \cup \gamma(a_2))\ \sqsubseteq\ a_1 \sqcup a_2
\end{equation*}

To make $a_1 \sqcup a_2$ as small as possible, we let the equality hold, defining

\begin{equation*}
   a_1 \sqcup a_2 \ =\ \alpha(\gamma(a_1) \cup \gamma(a_2))
\end{equation*}

For example,

\begin{equation*}
\begin{array}{rcl}
   [0,2] \sqcup [5,8] 
&=& \alpha(\gamma([0,2]) \cup \gamma([5,8])) \\
&=& \alpha(\{0,1,2, 5,6,7,8\}) \\
&=& [0,8]
\end{array}
\end{equation*}

Abstract Postcondition

We had: $sp(\cdot,c) : C \to C$

Now we have: $sp^\#(\cdot,c) : A \to A$

For correctness, the condition from Abstract Interpretation Recipe is that for each $a \in A$ and each command $r$:

\begin{equation*}
    sp(\gamma(a),r) \subseteq \gamma(sp^\#(a,r))
\end{equation*}

We would like $sp^\#$ to be as small as possible so that this condition holds

By property of Galois Connection, the condition $sp(\gamma(a),r) \subseteq \gamma(sp^\#(a,r))$ is equivalent to

\begin{equation*}
   \alpha(sp(\gamma(a),r)) \sqsubseteq sp^\#(a,r)
\end{equation*}

Because we want $sp^\#$ to be as small as possible (to obtain correct result), we let equality hold:

\begin{equation*}
   sp^\#(a,r) = \alpha(sp(\gamma(a),r))
\end{equation*}

Because we know $\alpha,\gamma,sp$, we can compute the value of $sp^\#(a,r)$ by simplifying certain expressions involving sets of states.

Example: for $p \le q$ we have:

\begin{equation*}
\begin{array}{rcl}
sp^\#([p,q],x:=x+3) 
  &=& \alpha(sp(\gamma([p,q]),x:=x+3)) \\
  &=& \alpha(sp(\{x \mid \ p \le x \land x \le q\},x:=x+3)) \\
  &=& \alpha(\{x+3 \mid p \le x \land x \le q\}) \\
  &=& \alpha(\{y \mid p+3 \le y \land y \le q+3\}) \\
  &=& [p+3,q+3]
\end{array}
\end{equation*}

For $K$ an integer constant and $a \neq \bot$, we have

\begin{equation*}
sp^\#(a,x:=K) = [K,K]
\end{equation*}

Note that for every command given by relation $r$, we have

\begin{equation*}
\begin{array}{rcl}
   sp^\#(\bot,r) 
&=& \alpha(sp(\gamma(\bot),r)) \\
&=& \alpha(sp(\emptyset,r)) \\
&=& \alpha(\emptyset) \\
&=& \bot
\end{array}
\end{equation*}

Abstract Semantic Function for the Program

Solving Abstract Function

Doing the analysis means computing $(F^\#)^n(\bot,\bot,\bot,\bot)$ for $n \ge 0$:

\begin{equation*}
\begin{array}{l}
   (\bot,\bot,\bot,\bot) \\
   (\top,\bot,\bot,\bot) \\
   (\top,[0,0],\bot,\bot) \\
   (\top,[0,0],[0,0],\bot) \\ 
   (\top,[0,3],[0,3],\bot) \\
   (\top,[0,3],[0,3],\bot) \\ 
   (\top,[0,6],[0,3],\bot) \\
   (\top,[0,6],[0,6],\bot) \\
   (\top,[0,9],[0,9],\bot) \\
   (\top,[0,12],[0,9],\bot) \\
   (\top,[0,12],[0,9],[10,12]) \\
   (\top,[0,12],[0,9],[10,12]) \\
   \ldots
\end{array}
\end{equation*}

Note the approximation (especially in the last step) compared to Collecting Semantics for Example Program.

Correctness

We are talking about overapproximating analysis.

In Lecture 05 we discussed that a sufficient condition is to have

\begin{equation*}
    post(\gamma(g^\#)) \subseteq \gamma(post^\#(g^\#))
\end{equation*}

When we talk about the transition relation, then we need a big $\Gamma$, which maps states including program counter. We define it by

\begin{equation*}
  \Gamma(g^\#_0,g^\#_1,g^\#_2,g^\#_3) = 
  (\gamma(g^\#_0),\gamma(g^\#_1),\gamma(g^\#_2),\gamma(g^\#_3))
\end{equation*}

The soundness condition is then:

\begin{equation*}
    post(\Gamma(g^\#)) \subseteq \Gamma(post^\#(g^\#))
\end{equation*}

where $\subseteq$ is defined pointwise for each component of the tuple. By definition of $\Gamma$, $post$, and $post^\#$, this reduces to pointwise conditions.

For example, we need to ensure that

\begin{equation*}
\begin{array}{ll}
          & sp(\gamma(g^\#_0), x:=0) \cup sp(\gamma(g^\#_2), x:=x+3)) \\
\subseteq & \gamma(sp^\#(g^\#_0, x:=0) \sqcup sp^\#(g^\#_2, x:=x+3))
\end{array}
\end{equation*}

To show this, we can start from left hand side and push $\gamma$ outside, obtaining larger and larger sets.

Exercise 1

Consider an analysis that has two integer variables, for which we track intervals, and one boolean variable, whose value we track exactly.

Give the type of $F^{\#}$ for such program.

Exercise 2

Consider the program that manipulates two integer variables $x,y$.

Consider any assignment $x=e$, where $e$ is a linear combination of integer variables, for example,

\begin{equation*}
    x = 2*x - 5*y
\end{equation*}

Consider an interval analysis that maps each variable to its value.

Describe an algorithm that will, given a syntax tree of $x=e$ and intervals for $x$ (denoted $[a_x,b_x]$) and $y$ (denoted $[a_y,b_y]$) find the new interval $[a,b]$ for $x$ after the assignment statement.

Exercise 3

a)

For a program whose state is one integer variable and whose abstraction is an interval, derive general transfer functions $sp^\#(a,c)$ for the following statements $c$, where $K$ is an arbitrary compile-time constant known in the program:

  • x= K;
  • x= x + K;
  • assume(x ⇐ K)
  • assume(x >= K)

b)

Consider a program with two integer variables, x,y. Consider analysis that stores one interval for each variable.

  • Define the domain of lattice elements $a$ that are computed for each program point.
  • Give the definition for statement $sp^{\#}(a,y=x+y+K)$

c)

Draw the control-flow graph for the following program.

// v0
x := 0;
// v1
while (x < 10) {
  // v2
  x := x + 3;
}
// v3
if (x >= 0) {
  if (x <= 15) {
    a[x]=7; // made sure index is within range
  } else {
    // v4
    error;
  }
} else {
  // v5
  error;
}

Run abstract interpretation that maintains an interval for $x$ at each program point, until you reach a fixpoint.

What are the fixpoint values at program points $v_4$ and $v_5$?