# Designing Correct Data-Flow Analyses

## Define Properties of Interest

Define precisely what property is needed (depends on the application), e.g.

- absence of errors
- potentially improving code performance without changing program meaning

Example: every variable assigned before being read

## Define Analysis (Semi)Lattice to Express Properties

Find a lattice that

- encodes properties of interest
- enables terminating analysis

Specify lattice top, bottom, join

## Precisely Specify Meaning of Lattice Elements

Specify **concretization function**, mapping lattice elements to sets of states

In SignAnalysis.scala, we have for for

where, assuming infinite integers we have

**Monotonicity requirement** for :

- implies , for all

Note that this implies

**Example**

Lattice element given by ,

Then

Here we wrote to mean the function .

## Compute a Sound Initial Lattice Element

Initially, the set of reachable states is empty at all points, except at CFG entry, where we have set of initial states of program

In the initial state of data-flow analysis, we will assign to each program location except entry

At entry we assign lattice element such that

## Derive Sound Transfer Functions

For each statement , define

Needs to correctly approximate the effect of statement on states given by .

Formally, for all statements and lattice elements the correctness condition is:

Here is the transfer function from Concrete Execution as Data-Flow Analysis

- takes set of states and runs statement on each state

Recall SignAnalysis.scala transfer functions

## Sketch of Correctness Proof

Why are computed results correct?

Imagine running in parallel a step of

- our data flow analysis (abstract analysis)
- Concrete Execution as Data-Flow Analysis - step computes all executions of length up to

We can show, for each

- at step results in abstract analysis approximate concrete execution of up to step

We can show that in the limit, this approximation holds

- limit of concrete data-flow analysis computes set of reachable states
- thus, the compute results approximate reachable states

This can be made completely precise using theory of lattices and fixed points