
Lecture 9: Fixpoints in static analysis

Recall formula propagation from Lecture 7. See correctness of formula propagation.

Static analysis as equation solving in a lattice

If, instead of formulas, the facts $\mbox{fact}(v)$ are elements of some lattice, this condition becomes

\begin{equation*}
  sp(\gamma(\mbox{fact}(u)),c) \subseteq \gamma(\mbox{fact}(v))
\end{equation*}

(where $sp$ now denotes the operator on sets of states and not on formulas).

By the defining property of a Galois connection ($\alpha(S) \sqsubseteq a$ iff $S \subseteq \gamma(a)$), this is equivalent to

\begin{equation*}
  \alpha(sp(\gamma(\mbox{fact}(u)),c)) \sqsubseteq \mbox{fact}(v)
\end{equation*}

and we defined the best transformer as $sp^{\#}(a,c) = \alpha(sp(\gamma(a),c))$, so this condition is equivalent to

\begin{equation*}
  sp^{\#}(\mbox{fact}(u),c) \sqsubseteq \mbox{fact}(v) \qquad\qquad (*)
\end{equation*}

We conclude that finding an inductive invariant expressed using elements of a lattice corresponds to finding a solution of the system of inequations $(*)$, one for each edge $(u,c,v) \in E$. We next examine the existence and computation of such solutions more generally.

Note that $sp^{\#}(a,c)$ is monotonic in $a$. Why? Because $\gamma$, $sp$ (in its first argument), and $\alpha$ are all monotonic, and a composition of monotonic functions is monotonic.

So, we are solving a conjunction of inequations

\begin{equation*}
  a^i \sqsupseteq f_k(a^j)
\end{equation*}

where $a^1,\ldots,a^n$ are variables ranging over a lattice $A$ (one for each control-flow graph node), and $f_k$ are monotonic functions on $A$ (one for each edge in the control-flow graph).

From a system of equations to one equation

In a lattice, a conjunction of inequations

\begin{eqnarray*}
 x \sqsupseteq y_1 \\
 \ldots \\
 x \sqsupseteq y_n
\end{eqnarray*}

is equivalent to the single inequation $x \sqsupseteq y_1 \sqcup \ldots \sqcup y_n$, which in turn is equivalent to the equation $x = x \sqcup y_1 \sqcup \ldots \sqcup y_n$ (that is, $x$ is the least upper bound of $\{x, y_1, \ldots, y_n\}$).

So we can write the conjunction of inequations in the previous section as a system of equations

\begin{eqnarray*}
  a^1 &=& g_1(a^1,\ldots,a^n) \\
  & \ldots & \\
  a^n &=& g_n(a^1,\ldots,a^n)
\end{eqnarray*}
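Here, by the equivalence above, each $g_i$ gathers $a^i$ together with the right-hand sides of all inequations that constrain $a^i$:

\begin{equation*}
  g_i(a^1,\ldots,a^n) = a^i \sqcup \bigsqcup \{\, f_k(a^j) \mid a^i \sqsupseteq f_k(a^j) \mbox{ is in the system} \,\}
\end{equation*}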

Now introduce a new product lattice $\bar A = A^n$ with order $(a^1,\ldots,a^n) \sqsubseteq (b^1,\ldots,b^n)$ iff $a^1 \sqsubseteq b^1$, \ldots, $a^n \sqsubseteq b^n$ all hold. Then we have reduced the problem to finding a solution of the single equation

\begin{equation*}
  x = G(x)
\end{equation*}

where $G(a^1,\ldots,a^n) = (g_1(a^1,\ldots,a^n),\ldots,g_n(a^1,\ldots,a^n))$. The function $G$ is also monotonic (in the product order).

A solution of an equation of the form $x=G(x)$ is called a fixed point of the function $G$, because $G$ leaves $x$ in place.

The iterative process that we described above corresponds to picking an initial element $x$ such that $x \sqsubseteq G(x)$ and then computing $G(x)$, $G(G(x))$, \ldots, $G^n(x)$, \ldots:

  • we have $G^n(x) \sqsubseteq G^{n+1}(x)$ (by monotonicity of $G$ and induction on $n$)
  • we have $x \sqsubseteq G^n(x)$ (by transitivity)

If the iteration stabilizes, we have $G^{n+1}(x)=G^n(x)$ for some $n$, so $G^n(x)$ is a fixed point. This always happens if the lattice has no infinite ascending chains.
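As a minimal sketch (in Scala, with hypothetical names), assuming a lattice carrier type A with structural equality and a monotonic function g, the iteration looks as follows:

  // Fixpoint iteration sketch: starting from a value x with x <= g(x)
  // (typically the bottom element), apply g until the value stops
  // changing. This terminates when the lattice has no infinite
  // ascending chains.
  def iterate[A](g: A => A, start: A): A = {
    var x = start
    var next = g(x)
    while (next != x) { // relies on structural equality of A
      x = next
      next = g(x)
    }
    x
  }

For instance, iterate[Set[Int]](s => s ++ s.map(x => (x + 1) min 3), Set(0)) converges to Set(0, 1, 2, 3).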

What if we have arbitrary lattices? See, for example, a non-converging iteration in the reals.

$\omega$-continuity of a function $f$ on a complete lattice is the requirement that for every chain $x_1 \sqsubseteq x_2 \sqsubseteq \ldots \sqsubseteq x_n \sqsubseteq \ldots$ we have

\begin{equation*}
  f(\bigsqcup_i x_i) = \bigsqcup_i f(x_i)
\end{equation*}

(this generalizes continuity of functions on the real numbers). For $\omega$-continuous functions, even if there are infinite ascending chains, the fixed point iteration converges in the limit.
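In this case, the least fixed point is the limit of the iteration started from the bottom element (Kleene's fixed point theorem):

\begin{equation*}
  \mbox{lfp}(f) = \bigsqcup_{n \ge 0} f^n(\bot)
\end{equation*}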

In general, even without $\omega$-continuity, we know that a least fixed point exists, due to Tarski's fixed point theorem.

See also the lecture notes on the SAV07 resource page, in particular the notes by Michael Schwartzbach for the finite case and the slides by David Schmidt for $\omega$-continuity. You can also check the papers by Cousot & Cousot referenced below.

Widening

When a lattice has infinite ascending chains, iterating the strongest postcondition for each program point need not converge.

  • Example: interval analysis

One approach: redesign the lattice so that it has no infinite ascending chains.

Another approach: change the fixpoint iteration itself so that it converges faster; this is widening.

The effect is as if we had a lattice of smaller height (but one that still covers everything from $\top$ to $\bot$), overlaid on top of the bigger lattice.

Example (Michael Schwartzbach's notes, Section 7): define widening using a monotone operator $w$ that is composed with the fixpoint operator $G$. The resulting composition is again monotone, and the result is similar to computing in a smaller lattice.

Example widening operator: take a finite set $B$ of bounds such that $-\infty, +\infty \in B$; usually $B$ also contains some constants that occur in the program. Then widen each interval by jumping to the closest lower and upper bounds in $B$, as in the sketch below. Each bound can jump only finitely many times before reaching $[-\infty,+\infty]$, but often the fixpoint is reached before this happens.
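A minimal sketch of this operator in Scala, under hypothetical encodings: Long.MinValue and Long.MaxValue stand for $-\infty$ and $+\infty$, and B is a small example bound set.

  // Interval with lo <= hi; Long.MinValue / Long.MaxValue encode -oo / +oo.
  case class Interval(lo: Long, hi: Long)

  // Finite bound set; it must contain both infinities so that the
  // filters below are never empty.
  val B: List[Long] = List(Long.MinValue, -1L, 0L, 1L, 10L, Long.MaxValue)

  // Round the lower bound down and the upper bound up to the closest
  // bounds in B. This operator is monotone, and each bound can change
  // only finitely often, so the widened iteration terminates.
  def widen(iv: Interval): Interval =
    Interval(B.filter(_ <= iv.lo).max, B.filter(_ >= iv.hi).min)

For example, widen(Interval(3, 42)) yields Interval(1, Long.MaxValue), that is, $[1,+\infty]$.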

Sometimes we first use $G$ for the fixpoint iteration until we conclude that the value at some program point is not “stable” (it has not converged yet), and only then switch to widening at that point:

  • advantage: more precise
  • drawback: not monotonic (example), so more difficult to reason about

Often the widening operator is defined not just on the current fact $x$ but as a binary operator that takes $x$ and $G(x)$, denoted $x \nabla G(x)$, with the property $a \sqcup b \sqsubseteq a \nabla b$.
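For intervals, a classic instance of such a binary widening keeps the stable bounds and pushes the unstable ones to infinity:

\begin{equation*}
  [a,b] \,\nabla\, [c,d] = [\,\mbox{if } c < a \mbox{ then } -\infty \mbox{ else } a,\ \ \mbox{if } d > b \mbox{ then } +\infty \mbox{ else } b\,]
\end{equation*}

One can check that $[a,b] \sqcup [c,d] \sqsubseteq [a,b] \nabla [c,d]$, as required.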

Narrowing

Once we have reached a value above the least fixpoint (using widening or anything else), we can keep applying the fixpoint operator $G$ to improve the precision of the results.

If this is too slow, we can stop at any point or use a narrowing operator; the result remains sound.

Instead of using the fixpoint operator $G$, we can use any operator that keeps the current value above the least fixpoint.

See Combination of Abstractions in ASTREE, Section 8.1, which defines $\Delta$ such that $\gamma(a) \cap \gamma(b) \subseteq \gamma(a \Delta b)$.

  • example: take $\sqcap$ as $\Delta$, or even some improved version, as in the sketch below
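For example, the classic interval narrowing refines only the bounds that widening pushed to infinity:

\begin{equation*}
  [a,b] \,\Delta\, [c,d] = [\,\mbox{if } a = -\infty \mbox{ then } c \mbox{ else } a,\ \ \mbox{if } b = +\infty \mbox{ then } d \mbox{ else } b\,]
\end{equation*}

and indeed $\gamma([a,b]) \cap \gamma([c,d]) \subseteq \gamma([a,b] \,\Delta\, [c,d])$.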

Reduced product

When we have multiple lattices for different properties, we can of course run both analyses independently.

If the analyses do not exchange information, the result is the same as running them one after the other; this corresponds to taking the product of the two lattices, with componentwise operations on it.

To obtain more precise results, we allow analyses to exchange information.

Reduced product: Systematic Design of Program Analysis Frameworks, Section 10.1.

Key point: define $\sigma : A_1 \times A_2 \to A_1 \times A_2$ by

\begin{equation*}
  \sigma(a_1,a_2) = \sqcap \{ (a_1',a_2') \mid \gamma(a_1') \cap \gamma(a_2') = \gamma(a_1) \cap \gamma(a_2) \}
\end{equation*}

The function $\sigma$ improves the precision of a lattice element without changing its meaning. Subsequent computations with these reduced values then give more precise results.
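For example (a hypothetical pairing chosen for illustration), if $A_1$ is the interval lattice and $A_2$ the parity lattice, then

\begin{equation*}
  \sigma([1,9], \mbox{even}) = ([2,8], \mbox{even})
\end{equation*}

since the even values in $[1,9]$ are exactly $\{2,4,6,8\}$, and $[2,8]$ is the smallest interval containing them.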