Initialization Analysis

We follow the methodology of Designing Correct Data-Flow Analyses

Defining Properties of Interest

We represent program execution as a sequence of states $s_i$ and simple statements $c_i$ , starting with the initial state

Given a sequence

$\begin{equation*} s_0\ c_1\ s_1\ c_2\ \ldots\ s_k\ \ldots\ c_n\ s_n \end{equation*}$

we say that variable $x$ is initialized at step $k$ if one of the statements $c_1,\ldots,c_n$ is an assignment to $x$ .

We say that a variable $x$ is definitely initialized at program point $p$ , if for every execution that reaches program point $p$ , the variable $x$ is initialized at $p$ .

We would like to compute the set of variables that are definitely initialized.

Reformulation of the Property

To avoid talking about previous states in a sequence and talk only about states, we use the following trick

(an instance of a general technique of history variables)

For each variable, introduce an additional boolean flag (0 or 1) that indicates whether the variable is initialized

We have

$Var_X = \{ x_1, \ldots, x_n \}$ is the set of original variables
$Var_B = \{ b_1, \ldots, b_n \}$ the corresponding boolean flags

The state is map $c : (Var_X \cup Var_B) \to Int$

Then:

at the beginning, all flags are 0
at each assignment, we set the flag to 1

Desired property: if a statement reads a variable, then its flag must be 1

Defining Semilattice to Express Properties

We use

$\bot$ to represent that the case when there are no states of initialization flag set to 0 (either there are no states at all, or there are only states with flag 1)
- in other words: initialized variable
$\top$ to represent states where initialization flag can be anything (0 or 1)
- in other words: possibly uninitialized variable

Lattice element is map $m : Var_X \to \{\bot,\top\}$

We use a pointwise lattice

Specifying Meaning of Lattice Elements

$\begin{equation*} \gamma(m) = \{ c \mid \forall b_i \in Var_B.\ (m(x_i)=\bot \rightarrow c(b_i)=1) \} \end{equation*}$

Example: Let $Var = \{ x, y \}$ and let $m = \{(x_1,\bot),(x_2,\top)\}$ .

Then $\gamma(m)$ is the set of concrete states $c$ such that $m(b_1) = 1$ i.e. $x_1$ is initialized.

There is no constraint on $x_2$ , because $m(x_2)=\top$ .

Checking monotonicity: if $m \sqsubseteq m'$ then $\gamma(m) \subseteq \gamma(m')$

note that if element in map goes from $\bot$ to $\top$ , then $\gamma$ value gets only bigger

Initial Lattice Element

All points except entry get value $\bot$ for each variable - as usual

What can we assign to entry?

all variables are uninitialized
so all elements are $\top$

Transfer Functions

Two kinds of statements in CFG (ignoring procedure calls):

dooes not change state (e.g. test) - initialization remains same
they assign to some variable x - set initialization of x to $\bot$ (it is initialized)

Note on Procedures

What about procedure calls?

for local variables - no change, procedure cannot change them
for global variables
- who knows what procedure might do
- a safe thing is to set all global variables to $\top$

Intraprocedural analysis: analyzes one procedure at time

Interprocedural analysis: descend into procedures at call site

Using Results of Initialization Analysis

If at node $v$ we have $m(v)=\top$ and there is a statement from $v$ that reads $x$ , report error