LARA

Idea of Data Flow Analysis

Goal Today

Introduce the ide of data-flow analysis

  • a form of semantic analysis

Why it is useful

Automatically derive information about the program

Use this information for:

  • correctness checks (initialization, correct order of operations, absence of errors)
  • safely enabling optimizations

Type Checking versus Data-Flow Analysis

We have seen one kind of semantic analysis - type checking

  • points out type errors, wrong operand application

Example of Java fragments

boolean y;
int x;
x = 0;
y = x + 1;
int x, y;
x = 0;
y = x + 1;

What was rule for type checking a sequence?

Note how type checking rules ignore order of statements

  • Permute statements - same environment, same type checking result
  • This type of analysis is called flow insensitive

Consider another problem: variable initialization

  • ensure that each local variable is initialized before its first use

Compare this method body:

  public void f() {
int x, y;
x = 0;
y = x + 1;
  }
  public void g() {
int x, y;
y = x + 1;
x = 0;
  }

Test.java:9: variable x might not have been initialized

      y = x + 1;
          ^

Variable initialization is flow sensitive data-flow analysis

  • depends on the order of operations

How do we do such analysis?

  • we have seen how to modify an interpreter
  • analysis is similar to interpretation, but is simpler and guaranteed to terminate

Control-Flow Graphs

Syntax trees are not best medium for analysis

  • evaluation order not obvious from tree
  • many kinds of statements - many cases to handle in analysis

Bytecode or machine code:

  • specific for architecture
  • many kinds of instructions
  • useful information on e.g. types is lost

We therefore introduce control-flow graph representation

  • good for analysis
  • also good for code generation

Example program:

int i = n;
while (i > 1) {
  System.out.println(i);
  if (i % 2 == 0) {
    i = i / 2;
  } else {
    i = 3*i + 1;
  }
}

What is its control-flow graph?

How can we interpret control-flow graph?

  • running it with e.g. n=3

Running the program “abstractly”

  • show that x=0;y=x+1 initialized

Abstractly Interepreting the example CFG