Checking Initialization Using Data-Flow Analysis
We follow the methodology of Designing Correct Data-Flow Analyses
Defining Properties of Interest
We represent program execution as a sequence of program states and simple statements.
Program state is where
is a program point
is the map from variables to their values
Given a sequence
we say that variable is initialized at step
if one of the statements
is an assignment to
.
Note that if the variable is initialized, it remains initialized during that execution.
We say that a variable is definitely initialized at program point
, if for every execution of length
such that
, we have that
is initialized at step
.
We would like to compute, for each program point, which variables at that point are definitely initialized.
Then, we will emit a warning if a variable is not definitely initialized at some program point, but there is a statement reading its value that can be executed at this program point.
Reformulation of the Property
To avoid talking about previous states in a sequence and talk only about states, we use the following trick
- (an instance of a general technique of history variables)
For each variable, introduce an additional boolean flag (0 or 1) that indicates whether the variable is initialized
We have
is the set of original variables
the corresponding boolean flags
The state is map
Then:
- at the beginning, all flags are 0
- at each assignment, we set the flag to 1
Desired property: if a statement reads a variable, then the variable's flag must be 1
Defining Semilattice to Express Properties
We use
to represent that the case when there are no states of initialization flag set to 0 (either there are no states at all, or there are only states with flag 1)
- in other words: initialized variable
to represent states where initialization flag can be anything (0 or 1)
- in other words: possibly uninitialized variable
Lattice element is map
We use a pointwise lattice
Specifying Meaning of Lattice Elements
Example: Let and let
.
Then is the set of concrete states
such that
i.e.
is initialized.
There is no constraint on , because
.
Checking monotonicity: if then
- note that if element in map goes from
to
, then
value gets only bigger
Initial Lattice Element
All points except entry get value for each variable - as usual
What can we assign to entry?
- all variables are uninitialized
- so all elements are
Transfer Functions
Two kinds of statements in CFG (ignoring procedure calls):
- dooes not change state (e.g. test) - initialization remains same
- they assign to some variable x - set initialization of x to
(it is initialized)
Formally,
and
if does not assign to any variable.
Note on Procedures
What about procedure calls?
- for local variables - no change, procedure cannot change them
- for global variables
- who knows what procedure might do
- a safe thing is to set all global variables to
Intraprocedural analysis: analyzes one procedure at time
Interprocedural analysis: descend into procedures at call site
Using Results of Initialization Analysis
If at node we have
and there is a statement from
that reads
, report error “reading a possibly uninitialized variable x”.