Abstract Interpretation of a Transition System

Computing Invariants in a Transition System

Transition system is given by a set of initial states, $Init \subseteq S$ and one relation $r \subseteq S^2$ .

relation $r$ specifies one step of a system
we have seen that we can derive such relation from control-flow graph
- if we start from program that manipulates two integers $x,y$ , then $r$ will be relation on states of the form $(pc,x,y)$ specifying the program point and the values of program variables

The reachable states of the system are $sp(Init, r^{*})$ .

One important question is to check whether

$\begin{equation*} sp(Init, r^*) \subseteq Good \end{equation*}$

for some set of states $Good$ ; whether the reachable states are well-behaved (the completement $S \setminus Good$ are the Error states, so this is same as asking $sp(Init, r^*) \cap Error = \emptyset$ .

For this we seek $I$ such that:

$Init \subseteq I$
$sp(I,r) \subseteq I$ i.e. $sp(I,r) \cup I = I$
$I \subseteq Good$

For every set $c$ , define: $post(c) = Init \cup sp(c,r)$

Then we are looking for $I$ such that $post(I) = I$ i.e. a fixpoint of $post$ . We have in fact shown in homework that if the set of invariants is non-empty, then the least set such that $Init \subseteq I$ and $sp(I,r) \subseteq I$ is in fact $sp(Init,r^*)$ . This is the set of reachable states. Note that this set is the least fixpoint of $post$ .

Suppose that $post(I) \subseteq I$ . This means

$\begin{equation*} Init \cup sp(I,r) = I \end{equation*}$

which implies both $Init \subseteq I$ , and $sp(I,r) \subseteq I$ . It then remains to check $I \subseteq Good$ .

Approximating Invariants using a Lattice of Abstract Elements

We have seen that the set of possible invariants $I \subseteq S$ can be infinite.

Therefore, we replace it with a smaller set $A$ . Each element $a \in A$ represents an entire set of states.

We formalize the notion represents using a concretization function $\gamma : A \to 2^S$ .

We then require $A$ to be a lattice with order $\sqsubseteq$ . Ideally, we can define lattice operations using corresponding operations on sets of states, $2^S$ :

$x \sqsubseteq y$ iff $\gamma(x) \subseteq \gamma(y)$

More generally, what we require is one direction of implication, which states that $\gamma$ is monotonic:

$\begin{equation*} x \sqsubseteq y \rightarrow\ \gamma(x) \subseteq \gamma(y) \end{equation*}$

Observation: Suppose $x \subseteq y$ iff $\gamma(x) \subseteq \gamma(y)$ and $(A,\sqsubseteq)$ . Suppose $x \sqcup y = z$ . Then $\gamma(x) \subseteq \gamma(z)$ and $\gamma(y) \subseteq \gamma(z)$ , so

$\begin{equation*} \gamma(x) \cup \gamma(y) \subseteq \gamma(x \sqcup y) \end{equation*}$

The equality need not hold in general.

Example: take $S = \mathbb{Z}$ (the integers). Let $A$ be the set of finite interval representations, extended with the set $\top$ denoting all elements. Therefore, the abstract domain is:

$\begin{equation*} A = \{ \top,\bot \} \cup \{ (x,y) \mid x \le y \} \end{equation*}$

Define $\gamma(\top) = \mathbb{Z}$ , $\gamma(\bot) = \emptyset$ , and $\gamma((a,b)) = \{ x \mid a \le x \land x \le b \}$ . We can prove that this order is a lattice.

We can show that it holds that:

$\begin{equation*} (a_1,b_1) \sqcup (a_2,b_2) = (\min(a_1,a_2), \max(b_1,b_2)) \end{equation*}$

Note that, e.g. $(1,5) \sqcup (10,15) = (1,15)$ , so we have

$\begin{equation*} \gamma((1,5)) \cup \gamma((10,15)) \subset \gamma(1,15) \end{equation*}$

This illustrates one place where we use approximation to be able to work with the domain $A$ instead of $2^S$ .

Is $\gamma$ an injective function?

Is $\gamma$ a surjective function?

Approximating the Fixpoint

Our goal is now to approximate fixpoint using an element of the abstract domain $A$ .

Remember $\gamma(a)$ denotes the meaning of $a \in A$ as the set of states.

We will find $I_a \in A$ such that we still have $post(\gamma(I_a)) = \gamma(I_a)$ . It will not necessarily be the least invariant but we will have

$\begin{equation*} sp(Init,r^*) \subseteq \gamma(I_a) \end{equation*}$

Thus, $I_a$ will approximate the set of reachable states.

We define $post^{\#} : A \to A$ to be the abstract analog of $post : 2^S \to 2^S$ . We require the approximation to hold at every step, so the result of $post^{\#}$ should represent at least the states represented by the concrete post. We thus require:

$\begin{equation*} post(\gamma(a)) \subseteq \gamma(post^{\#}(a)) \end{equation*}$

for all elements $a \in A$ of the abstract domain.

Theorem: Let

$(A,\sqsubseteq)$ be a lattice
$\gamma : A \to 2^S$ be a monotonic function
$post^{\#} : A \to A$ be such that
$post(\gamma(a)) \subseteq \gamma(post^{\#}(a))$ for all $a$ , and
$post^{\#}(I_a)=I_a$ .

Then $\gamma(I_a)$ is a fixpoint of $post$ (and thus $sp(Init,r^*) \subseteq \gamma(I_a)$ ).

Proof:

$\begin{equation*} post(\gamma(I_a)) \subseteq \gamma(post^{\#}(I_a)) = \gamma(I_a) \end{equation*}$

Thus $\gamma(I_a)$ is a post-fixpoint of $post$ . Because $sp(Init,r^*)$ is the least fixpoint, we have $sp(Init,r^*) \subseteq \gamma(I_a)$ by e.g. Tarski's fixpoint theorem.

States and Domains in Analysis of Programs

When we construct a transition system from the program, we can take control-flow graph with statements and make program points part of the state, so if there is edge from $v_1$ to $v_2$ with command $c$ on the edge, we include into the relation $r$ all tuples

$\begin{equation*} ((v_1,s),(v_2,s')) \end{equation*}$

where $(s,s')$ belong to the meaning of the command $c$ .

Because the number of program points is finite, we have two main kinds of choices when defining the abstract lattice $A$ :

flow-sensitive: $A$ has elements of the form $(v,c)$ , specifying value for each program point, and is a product lattice of lattices for each program point. We define

$\begin{equation*} \gamma(v,c) = \{ (v,s) \mid s \in \gamma_1(c) \} \end{equation*}$

flow-insensitive: $A$ has elements that ignore the program points. For example, we keep only one range for all possible values of program variables at all program points