LARA

Abstract Interpretation of a Transition System

Computing Invariants in a Transition System

Transition system is given by a set of initial states, $Init \subseteq S$ and one relation $r \subseteq S^2$.

  • relation $r$ specifies one step of a system
  • we have seen that we can derive such relation from control-flow graph
    • if we start from program that manipulates two integers $x,y$, then $r$ will be relation on states of the form $(pc,x,y)$ specifying the program point and the values of program variables

The reachable states of the system are $sp(Init, r^{*})$.

One important question is to check whether

\begin{equation*}
   sp(Init, r^*) \subseteq Good
\end{equation*}

for some set of states $Good$; whether the reachable states are well-behaved (the completement $S \setminus Good$ are the Error states, so this is same as asking $sp(Init, r^*) \cap Error = \emptyset$.

For this we seek $I$ such that:

  1. $Init \subseteq I$
  2. $sp(I,r) \subseteq I$ i.e. $sp(I,r) \cup I = I$
  3. $I \subseteq Good$

For every set $c$, define: $post(c) = Init \cup sp(c,r)$

Then we are looking for $I$ such that $post(I) = I$ i.e. a fixpoint of $post$. We have in fact shown in homework that if the set of invariants is non-empty, then the least set such that $Init \subseteq I$ and $sp(I,r) \subseteq I$ is in fact $sp(Init,r^*)$. This is the set of reachable states. Note that this set is the least fixpoint of $post$.

Suppose that $post(I) \subseteq I$. This means

\begin{equation*}
    Init \cup sp(I,r) = I
\end{equation*}

which implies both $Init \subseteq I$, and $sp(I,r) \subseteq I$. It then remains to check $I \subseteq Good$.

Approximating Invariants using a Lattice of Abstract Elements

We have seen that the set of possible invariants $I \subseteq S$ can be infinite.

Therefore, we replace it with a smaller set $A$. Each element $a \in A$ represents an entire set of states.

We formalize the notion represents using a concretization function $\gamma : A \to 2^S$.

We then require $A$ to be a lattice with order $\sqsubseteq$. Ideally, we can define lattice operations using corresponding operations on sets of states, $2^S$:

  • $x \sqsubseteq y$ iff $\gamma(x) \subseteq \gamma(y)$

More generally, what we require is one direction of implication, which states that $\gamma$ is monotonic:

\begin{equation*}
   x \sqsubseteq y \rightarrow\ \gamma(x) \subseteq \gamma(y)
\end{equation*}

Observation: Suppose $x \subseteq y$ iff $\gamma(x) \subseteq \gamma(y)$ and $(A,\sqsubseteq)$. Suppose $x \sqcup y = z$. Then $\gamma(x) \subseteq \gamma(z)$ and $\gamma(y) \subseteq \gamma(z)$, so

\begin{equation*}
   \gamma(x) \cup \gamma(y) \subseteq \gamma(x \sqcup y)
\end{equation*}

The equality need not hold in general.

Example: take $S = \mathbb{Z}$ (the integers). Let $A$ be the set of finite interval representations, extended with the set $\top$ denoting all elements. Therefore, the abstract domain is:

\begin{equation*}
    A = \{ \top,\bot \} \cup \{ (x,y) \mid x \le y \}
\end{equation*}

Define $\gamma(\top) = \mathbb{Z}$, $\gamma(\bot) = \emptyset$, and $\gamma((a,b)) = \{ x \mid a \le x \land x \le b \}$. We can prove that this order is a lattice.

We can show that it holds that:

\begin{equation*}
    (a_1,b_1) \sqcup (a_2,b_2) = (\min(a_1,a_2), \max(b_1,b_2))
\end{equation*}

Note that, e.g. $(1,5) \sqcup (10,15) = (1,15)$, so we have

\begin{equation*}
    \gamma((1,5)) \cup \gamma((10,15)) \subset \gamma(1,15)
\end{equation*}

This illustrates one place where we use approximation to be able to work with the domain $A$ instead of $2^S$.

Is $\gamma$ an injective function? yes, at least in this case. Is it injective when 'iff' holds in the monotonicity condition?

Is $\gamma$ a surjective function? no. If it was both injective and injective, the two sets would have the same number of elements. But set of all subsets of integers is bigger.

Approximating the Fixpoint

Our goal is now to approximate fixpoint using an element of the abstract domain $A$.

Remember $\gamma(a)$ denotes the meaning of $a \in A$ as the set of states.

We will find $I_a \in A$ such that we still have $post(\gamma(I_a)) = \gamma(I_a)$. It will not necessarily be the least invariant but we will have

\begin{equation*}
   sp(Init,r^*) \subseteq \gamma(I_a)
\end{equation*}

Thus, $I_a$ will approximate the set of reachable states.

We define $post^{\#} : A \to A$ to be the abstract analog of $post : 2^S \to 2^S$. We require the approximation to hold at every step, so the result of $post^{\#}$ should represent at least the states represented by the concrete post. We thus require:

\begin{equation*}
    post(\gamma(a)) \subseteq \gamma(post^{\#}(a))
\end{equation*}

for all elements $a \in A$ of the abstract domain.

Theorem: Let

  • $(A,\sqsubseteq)$ be a lattice
  • $\gamma : A \to 2^S$ be a monotonic function
  • $post^{\#} : A \to A$ be such that
  • $post(\gamma(a)) \subseteq \gamma(post^{\#}(a))$ for all $a$, and
  • $post^{\#}(I_a)=I_a$.

Then $\gamma(I_a)$ is a fixpoint of $post$ (and thus $sp(Init,r^*) \subseteq \gamma(I_a)$).

Proof:

\begin{equation*}
  post(\gamma(I_a)) \subseteq \gamma(post^{\#}(I_a)) = \gamma(I_a)
\end{equation*}

Thus $\gamma(I_a)$ is a post-fixpoint of $post$. Because $sp(Init,r^*)$ is the least fixpoint, we have $sp(Init,r^*) \subseteq \gamma(I_a)$ by e.g. Tarski's fixpoint theorem.

States and Domains in Analysis of Programs

When we construct a transition system from the program, we can take control-flow graph with statements and make program points part of the state, so if there is edge from $v_1$ to $v_2$ with command $c$ on the edge, we include into the relation $r$ all tuples

\begin{equation*}
    ((v_1,s),(v_2,s'))
\end{equation*}

where $(s,s')$ belong to the meaning of the command $c$.

Because the number of program points is finite, we have two main kinds of choices when defining the abstract lattice $A$:

  • flow-sensitive: $A$ has elements of the form $(v,c)$, specifying value for each program point, and is a product lattice of lattices for each program point. We define

\begin{equation*}
  \gamma(v,c) = \{ (v,s) \mid s \in \gamma_1(c) \}
\end{equation*}

  • flow-insensitive: $A$ has elements that ignore the program points. For example, we keep only one range for all possible values of program variables at all program points