Collecting Semantics

Transitive Closure as Least Fixedpoint

Recall the definition of transitive closure for $r \subseteq D \times D$ a binary relation

    r^* = \bigcup_{n \ge 0} r^n

where $r^0 = \Delta_D$.

Lemma: Reflexive transitive closure is the least relation $s$ for which $f(s)=s$, where $f : D^2 \to D^2$ is given by $f(s) = \Delta \cup (s \circ r)$.

(We call $x$ for which $f(x)=x$ the fixpoint (or fixed point) of $f$.)

Note that $f$ is monotonic: if $s \subseteq s'$ then $f(s) \subseteq f(s')$.

Reachable States and Collecting Semantics

Question we wish to answer: What are the sets of reachable states at each program point?

We represent programs by control-flow graphs (CFG).

Program points are CFG nodes, denoted by $V$. Statements are labels on CFG edges.

We look at a particular way of representing and computing sets of reachable states, splitting states by program counter (control-flow graph node): collecting semantics.

$PS$ - the set of values of program variables (not including program counter).

For each program point $p$, we have the set of reachable states $g(p) \subseteq PS$.

The set of all reachable states of the program is $\{(p,s) \mid s \in g(p) \}$.

Let $p_0$ be initial program counter and $Init$ the set of values of program variables in $p_0$.


    g : V \to PS

The set of reachable states is defined as the least solution of constraints

  Init \subseteq g(p_0)

  \bigwedge_{(p_1,p_2) \in E} sp(g(p_1),r(c(p_1,p_2)))) \subseteq g(p_2)

where $c(p_1,p_2)$ is command associated with edge $(p_1,p_2)$, and $r(c(p_1,p_2))$ is the relation giving semantics for this command.

By least we mean least with respect to the following ordering on functions from control-flow graph points to sets: $g_1 \sqsubseteq g_2$ iff $\forall p. g_1(p)\subseteq g_2(p)$.

Lemma: Let $g$ be a least solution of the collecting semantics constraints. Then $s \in g(p)$ iff: there exists $s_0 \in Init$ and there exists a finite path through CFG that starts in $p_0$ with state $s_0$ and ends in $p$ with state $s$.

Proof: Consider the following $g^*$ given by collecting all states $(p,s)$ for which there exists such a path terminating at point $p$ in state $s$. We show that $g^*$ is the least solution of the constraints.

We first show that it is a solution. It satisfies equation $I \subseteq g^*(p_0)$. For the second equation, if we take element $s_1 \in g^*(p_1)$ then it is given by some path of length $n$ terminating at node $p_1$. Then the elements in $sp(\{s_1\},r(c(p_1,p_2))))$ are given by a path of length $n+1$ terminating at $p_2$. Therefore, these elements are in $g^*(p_2)$. Thus, for all $s_1 \in g^*(p_1)$, we have

   sp(\{s_1\},r(c(p_1,p_2)))) \subseteq g^*(p_2)

so the constraint is satisfied and $g^*$ is a solution.

Next, we show that $g^*$ is the least solution. Consider any solution $g$. We then show $g^*(p) \subseteq g(p)$ for all $p$. We show that all states given by paths (i.e. all states in $g^*$) are included in $g(p)$, by induction on the lengths of paths. Paths of length 0 are included by $I \subseteq g(p_0)$. The second condition guarantees that if all paths of length $n$ are included, then all paths of length $n+1$ are included.

Thus, we have shown that $g^*$ is the least solution, so the lemma follows by definition of $g^*$.

End of Proof.

Equivalent Form of the Constraints

Using the fact that the conjunction of constraints $A \subseteq C$ and $B \subseteq C$ is equivalent to

    C = C \cup A \cup B

the above constraints for collecting semantics are equivalent to:

  g(p_0) = g(p_0) \cup I

  \bigwedge_{p_2 \in V}\ \left( g(p_2) = g(p_2) \cup \bigcup_{(p_1,p_2) \in E} sp(g(p_1),r(p_1,p_2))) \right)