LARA

Abstract Interpretation of a Transition System

Computing Invariants in a Transition System

Transition system is given by a set of initial states, $Init \subseteq S$ and one relation $r \subseteq S^2$.

  • relation $r$ specifies one step of a system
  • we have seen that we can derive such relation from control-flow graph
    • if we start from program that manipulates two integers $x,y$, then $r$ will be relation on states of the form $(pc,x,y)$ specifying the program point and the values of program variables

The reachable states of the system are $sp(Init, r^{*})$.

One important question is to check whether

\begin{equation*}
   sp(Init, r^*) \subseteq Good
\end{equation*}

for some set of states $Good$; whether the reachable states are well-behaved (the completement $S \setminus Good$ are the Error states, so this is same as asking $sp(Init, r^*) \cap Error = \emptyset$.

For this we seek $I$ such that:

  1. $Init \subseteq I$
  2. $sp(I,r) \subseteq I$ i.e. $sp(I,r) \cup I = I$
  3. $I \subseteq Good$

For every set $c$, define: $post(c) = Init \cup sp(c,r)$

Then we are looking for $I$ such that $post(I) = I$ i.e. a fixpoint of $post$. We have in fact shown in homework that if the set of invariants is non-empty, then the least set such that $Init \subseteq I$ and $sp(I,r) \subseteq I$ is in fact $sp(Init,r^*)$. This is the set of reachable states. Note that this set is the least fixpoint of $post$.

Suppose that $post(I) \subseteq I$. This means

\begin{equation*}
    Init \cup sp(I,r) = I
\end{equation*}

which implies both $Init \subseteq I$, and $sp(I,r) \subseteq I$. It then remains to check $I \subseteq Good$.

Approximating Invariants using a Lattice of Abstract Elements

We have seen that the set of possible invariants $I \subseteq S$ can be infinite.

Therefore, we replace it with a smaller set $A$. Each element $a \in A$ represents an entire set of states.

We formalize the notion represents using a concretization function $\gamma : A \to 2^S$.

We then require $A$ to be a lattice with order $\sqsubseteq$. Ideally, we can define lattice operations using corresponding operations on sets of states, $2^S$:

  • $x \sqsubseteq y$ iff $\gamma(x) \subseteq \gamma(y)$

More generally, what we require is one direction of implication, which states that $\gamma$ is monotonic:

\begin{equation*}
   x \sqsubseteq y \rightarrow\ \gamma(x) \subseteq \gamma(y)
\end{equation*}

Observation: Suppose $x \subseteq y$ iff $\gamma(x) \subseteq \gamma(y)$ and $(A,\sqsubseteq)$. Suppose $x \sqcup y = z$. Then $\gamma(x) \subseteq \gamma(z)$ and $\gamma(y) \subseteq \gamma(z)$, so

\begin{equation*}
   \gamma(x) \cup \gamma(y) \subseteq \gamma(x \sqcup y)
\end{equation*}

The equality need not hold in general.

Example: take $S = \mathbb{Z}$ (the integers). Let $A$ be the set of finite interval representations, extended with the set $\top$ denoting all elements. Therefore, the abstract domain is:

\begin{equation*}
    A = \{ \top,\bot \} \cup \{ (x,y) \mid x \le y \}
\end{equation*}

Define $\gamma(\top) = \mathbb{Z}$, $\gamma(\bot) = \emptyset$, and $\gamma((a,b)) = \{ x \mid a \le x \land x \le b \}$. We can prove that this order is a lattice.

We can show that it holds that:

\begin{equation*}
    (a_1,b_1) \sqcup (a_2,b_2) = (\min(a_1,a_2), \max(b_1,b_2))
\end{equation*}

Note that, e.g. $(1,5) \sqcup (10,15) = (1,15)$, so we have

\begin{equation*}
    \gamma((1,5)) \cup \gamma((10,15)) \subset \gamma(1,15)
\end{equation*}

This illustrates one place where we use approximation to be able to work with the domain $A$ instead of $2^S$.

Is $\gamma$ an injective function?

Is $\gamma$ a surjective function?

Approximating the Fixpoint

Our goal is now to approximate fixpoint using an element of the abstract domain $A$.

Remember $\gamma(a)$ denotes the meaning of $a \in A$ as the set of states.

We will find $I_a \in A$ such that we still have $post(\gamma(I_a)) = \gamma(I_a)$. It will not necessarily be the least invariant but we will have

\begin{equation*}
   sp(Init,r^*) \subseteq \gamma(I_a)
\end{equation*}

Thus, $I_a$ will approximate the set of reachable states.

We define $post^{\#} : A \to A$ to be the abstract analog of $post : 2^S \to 2^S$. We require the approximation to hold at every step, so the result of $post^{\#}$ should represent at least the states represented by the concrete post. We thus require:

\begin{equation*}
    post(\gamma(a)) \subseteq \gamma(post^{\#}(a))
\end{equation*}

for all elements $a \in A$ of the abstract domain.

Theorem: Let

  • $(A,\sqsubseteq)$ be a lattice
  • $\gamma : A \to 2^S$ be a monotonic function
  • $post^{\#} : A \to A$ be such that
  • $post(\gamma(a)) \subseteq \gamma(post^{\#}(a))$ for all $a$, and
  • $post^{\#}(I_a)=I_a$.

Then $\gamma(I_a)$ is a fixpoint of $post$ (and thus $sp(Init,r^*) \subseteq \gamma(I_a)$).

Proof:

\begin{equation*}
  post(\gamma(I_a)) \subseteq \gamma(post^{\#}(I_a)) = \gamma(I_a)
\end{equation*}

Thus $\gamma(I_a)$ is a post-fixpoint of $post$. Because $sp(Init,r^*)$ is the least fixpoint, we have $sp(Init,r^*) \subseteq \gamma(I_a)$ by e.g. Tarski's fixpoint theorem.

States and Domains in Analysis of Programs

When we construct a transition system from the program, we can take control-flow graph with statements and make program points part of the state, so if there is edge from $v_1$ to $v_2$ with command $c$ on the edge, we include into the relation $r$ all tuples

\begin{equation*}
    ((v_1,s),(v_2,s'))
\end{equation*}

where $(s,s')$ belong to the meaning of the command $c$.

Because the number of program points is finite, we have two main kinds of choices when defining the abstract lattice $A$:

  • flow-sensitive: $A$ has elements of the form $(v,c)$, specifying value for each program point, and is a product lattice of lattices for each program point. We define

\begin{equation*}
  \gamma(v,c) = \{ (v,s) \mid s \in \gamma_1(c) \}
\end{equation*}

  • flow-insensitive: $A$ has elements that ignore the program points. For example, we keep only one range for all possible values of program variables at all program points