This is the last lab, you're almost there :) Due date is Saturday, Jan. 10th, 2009.
In this lab, you will implement range analysis, and will use it to try and prevent potential ArrayIndexOutOfBoundsException
s.
The lab can seem long at first, but if you follow the steps you should be just fine. The starting one (adding assert statements to the language) may seem unrelated, but the rest builds on it, and you will also be able to use this construct to do more checks if you want.
Add “assert” statement to the language: the syntax is simply assert(expr);. For this you will need to modify all phases of the compiler:
The run-time behaviour of a program with assert statements should be the same as if they weren't there.
Modify your control-flow graph generator from the previous lab so that the assert statements get translated to subgraphs equivalent to the following code:
assert(expr);
…translates as…
if(!expr) { error }
…where the special “action” error is represented by a new node defined in CFGTrees.scala:
case class CFGError extends CFGStatement with Positional
You can have all edges labeled with this new error statement go straight to the exit point of the function. This represents abnormal termination, for instance the raising of an AssertionError
.
Here's a simple example:
For each array read and write in the control flow graph, add just before it what would be equivalent to an assertion checking that the index is within the bounds of the array. That is, if you have an array read a[i], add the equivalent to assert(i >= 0 && i < a.length); right before it, and similarly for array writes. Make sure you modify your ASTtoCFG.scala file in a way that is not too destructive (ie. set a Boolean flag indicating whether you should add these “assertions”, or better yet, write the code to add the checks in a separate file).
For example, here's some code and the corresponding CFGs, without and with the additional check:
class SA { public static void main(String [] args) { System.out.println(new SimpleArrayAssign().run()); } } class SimpleArrayAssign { int [] arr; public int run() { arr = new int[4]; arr[3] = 42; return 42; } } |
(Don't mind the extra vertex v3 in the first graph: our implementation always generates an extra vertex to collect assertion failures. You don't see it in the second graph because it gets “compacted” by the call to fewerSkips
if they actually were assertions.)
Note that it is enough to have one error edge per array bound check (even if they are actually two checks done), since the positional information will be the same for both (the place in the source code where the index is passed).
Implement range analysis on top of your control-flow graphs. Use these files adapted from Lecture 11 to do that (note that you need to overwrite the file LabeledDirectedGraph.scala from last lab):
Your analysis should stabilize at a fix-point where for all integer variables you know at each program point some estimation of the range in which their value can be. Additionally, for each array you should keep trace of its possibles sizes in another range. You should probably take the following steps:
result#
if it is of one of these types)You should be conservative in your analysis: if you don't know anything about an integer, for instance because it was returned from a method, assume the worst case. For class members, you cannot do much either, unless you could know for sure that no method was called between a write and read to a given field (why?). You are allowed to make worst-case assumptions about values read from fields.
It should be clear that the property “an assertion can be violated” is equivalent to “the corresponding error node in the control-flow graph is reachable”. Using the information from the static analysis, traverse the control-flow graphs and emit a warning each time you are able to reach an error node. Use the positional information from that node to emit a warning such as “assertion potentially violated.”. You don't have to distinguish between the assertions you generated for the array bounds and the ones originally in the program, but in the first case, the positional information should point to the place in the source code corresponding to the array index expression.
You may realize now that your analysis can check more properties than just array access safety. If you write assert statements with reasonable test conditions, it should be able to check them too.