Idea of Data Flow Analysis
Goal Today
Introduce the ide of data-flow analysis
- a form of semantic analysis
Why it is useful
Automatically derive information about the program
Use this information for:
- correctness checks (initialization, correct order of operations, absence of errors)
- safely enabling optimizations
Type Checking versus Data-Flow Analysis
We have seen one kind of semantic analysis - type checking
- points out type errors, wrong operand application
Example of Java fragments
boolean y; int x; x = 0; y = x + 1;
int x, y; x = 0; y = x + 1;
What was rule for type checking a sequence?
Note how type checking rules ignore order of statements
- Permute statements - same environment, same type checking result
- This type of analysis is called flow insensitive
Consider another problem: variable initialization
- ensure that each local variable is initialized before its first use
Compare this method body:
public void f() { int x, y; x = 0; y = x + 1; } public void g() { int x, y; y = x + 1; x = 0; }
Test.java:9: variable x might not have been initialized
y = x + 1; ^
Variable initialization is flow sensitive data-flow analysis
- depends on the order of operations
How do we do such analysis?
- we have seen how to modify an interpreter
- analysis is similar to interpretation, but is simpler and guaranteed to terminate
Control-Flow Graphs
Syntax trees are not best medium for analysis
- evaluation order not obvious from tree
- many kinds of statements - many cases to handle in analysis
Bytecode or machine code:
- specific for architecture
- many kinds of instructions
- useful information on e.g. types is lost
We therefore introduce control-flow graph representation
- good for analysis
- also good for code generation
Example program:
int i = n; while (i > 1) { System.out.println(i); if (i % 2 == 0) { i = i / 2; } else { i = 3*i + 1; } }
What is its control-flow graph?
How can we interpret control-flow graph?
- running it with e.g. n=3
Running the program “abstractly”
- show that x=0;y=x+1 initialized
Abstractly Interepreting the example CFG