LARA

Analyses Based on Formulas

Many analyses can be viewed in this way.

Fixpoint Approach

Let the domain be the set of quantifier-free formulas of some logic, ordered by implication

Is this a partial order? We would need to do a quotient construction to obtain it.

Non-existence of limits: infinite disjunction need not be a formula of our logic, e.g. consider formulas on linear arithmetic and take the limit

\begin{equation*}
    \bigvee_{i=1}^{\infty} y = i x
\end{equation*}

In general, a logic that can represent exact least upper bounds needs to express the conditions such as “there exists a sequence of statements that produces the given state”.

Approximation

Rule for computing formula (called 'fact') in step $k+1$ at program point $p'$

\begin{equation*}
  \mbox{fact}^{k+1}(p') := N\left(\mbox{fact}^k(p) \vee \bigvee_{(p,p') \in E} sp^{\#}(\mbox{fact}^k(p),r(p,p'))\right)
\end{equation*}

where $N$ is some approximation function. This is analogous to Abstract interpretation recipe.

Here the goal of $N$ is to ensure termination (and efficiency). The essential condition is that

\begin{equation*}
    F \rightarrow N(F)
\end{equation*}

is a valid formula.

One generic approch: use a syntactic widening.

View formula as a conjunction. Then define

\begin{equation*}
\begin{array}{l}
   N\left( (C_1 \land \ldots \land C_n) \lor (D_1 \land \ldots \land D_m) \right) = \\
\ \ \   \bigwedge (\{C_1, \ldots, C_n \} \cap \{D_1, \ldots, D_m\})
\end{array}
\end{equation*}

Why are there no infinite chains when using such $N$ ?

To make it less syntactic: deduce some consequences of the conjunction (taken from a finite set):

\begin{equation*}
\begin{array}{l}
   N\left( (C_1 \land \ldots \land C_n) \lor (D_1 \land \ldots \land D_m) \right) = \\
\ \ \   \bigwedge (conseq(\{C_1, \ldots, C_n \}) \cap conseq(\{D_1, \ldots, D_m\}))
\end{array}
\end{equation*}

where

\begin{equation*}
    conseq(S) = \{ F \in U.\ \ (S \rightarrow F) \mbox{ is valid } \}
\end{equation*}

where $U$ is some class of useful formulas.

References

Constraint-Based Approach

Give up least fixpoint requirement (it is approximation anyway).

Search for any fixpoint - it will be a set of loop invariants:

\begin{equation*}
   \left(\bigvee_{(p,p') \in E} sp^{\#}(\mbox{fact}(p),r(p,p'))\right) \rightarrow \mbox{fact}(p)
\end{equation*}

Works better if we also have postconditions and assertions that need to be satisfied (otherwise we may come up with trivial invariants).

Assume a 'template' for the invariant, e.g. (Karr's analysis) each fact is a linear equation $c_1 x_1 + \ldots + c_n x_n = c_0$ on program variables $x_1,\ldots,x_n$.

The condition for loop invariants gives a system of equations on coefficients $c_0(p),c_1(p),\ldots,c_n(p)$ for each of the finitely many program points $p$.

If there exists a solution, this gives an invariant.

X := 10
Y := 0
while (X >= 0) {
  X := X-1
  if random() { Y = Y+1 }
}

Invariant of the form $a_1 X + a_2 Y = a_0$.

\begin{equation*}
   X=10 \land Y = 0 \rightarrow a_1 X + a_2 Y = a_0
\end{equation*}

\begin{equation*}
   a_1 X + a_2 Y = a_0 \land X \ge 0 \rightarrow a_1 (X-1) + a_2 Y = a_0
\end{equation*}

\begin{equation*}
   a_1 X + a_2 Y = a_0 \land X \ge 0 \rightarrow a_1 (X-1) + a_2 (Y+1) = a_0
\end{equation*}

Such equations need to hold for all values of $X$,$Y$, which gives constraints on $a_0,a_1,a_2$ that can be solved based on properties of real fields (linear programming, Farkas lemma).

Thanks to quantifier elimination for real fields, the technique extends to polynomial invariants as well.

If there is no solution, there is no invariant of that particular template form.

Note: for non-numerical domains, we obtain different kinds of constraints. A popular class are set constraints.

References