====== Labs 12 ====== This is the last lab, you're almost there :) Due date is Saturday, Jan. 10th, 2009. In this lab, you will implement range analysis, and will use it to try and prevent potential ''[[http://java.sun.com/javase/6/docs/api/java/lang/ArrayIndexOutOfBoundsException.html|ArrayIndexOutOfBoundsException]]''s. The lab can seem long at first, but if you follow the steps you should be just fine. The starting one (adding **assert** statements to the language) may seem unrelated, but the rest builds on it, and you will also be able to use this construct to do more checks if you want. ===== Assert statements ===== Add "assert" statement to the language: the syntax is simply **assert(**//expr//**);**. For this you will need to modify all phases of the compiler: - recognize the new keyword in the lexer - add a tree node in the AST and parse the new statement - do name analysis on the asserted expression - type check the expression (it should be a Boolean) - __but__ you don't have to generate code for these statements (actual Java assertions are a mess to generate) The run-time behaviour of a program with **assert** statements should be the same as if they weren't there. ===== Transforming assertions in the control-flow graphs ===== Modify your control-flow graph generator from the [[labs_11|previous lab]] so that the **assert** statements get translated to subgraphs equivalent to the following code: assert(expr); ...translates as... if(!expr) { error } ...where the special "action" error is represented by a new node defined in CFGTrees.scala: case class CFGError extends CFGStatement with Positional You can have all edges labeled with this new error statement go straight to the exit point of the function. This represents abnormal termination, for instance the raising of an ''[[http://java.sun.com/javase/6/docs/api/java/lang/AssertionError.html|AssertionError]]''. Here's a simple example: | class SA { public static void main(String [] args) { System.out.println(new SimpleAssert().run()); } } class SimpleAssert { public int run() { assert(5 < 3); return 42; } } | {{compilation:simpleassert-run-fixed.jpg?200}} | ===== Generating "assertions" for array reads and writes ===== For each array read and write in the control flow graph, add just before it what would be equivalent to an assertion checking that the index is within the bounds of the array. That is, if you have an array read a[i], add the equivalent to **assert(**i >= 0 && i < a.length**);** right before it, and similarly for array writes. Make sure you modify your ASTtoCFG.scala file in a way that is not too destructive (ie. set a Boolean flag indicating whether you should add these "assertions", or better yet, write the code to add the checks in a separate file). For example, here's some code and the corresponding CFGs, without and with the additional check: | class SA { public static void main(String [] args) { System.out.println(new SimpleArrayAssign().run()); } } class SimpleArrayAssign { int [] arr; public int run() { arr = new int[4]; arr[3] = 42; return 42; } } | {{compilation:simplearrayassign.jpg?200}} | {{compilation:simplearrayassignwithcheck-fixed.jpg?200}} | (Don't mind the extra vertex v3 in the first graph: our implementation always generates an extra vertex to collect assertion failures. You don't see it in the second graph because it gets "compacted" by the call to ''fewerSkips'' if they actually were assertions.) Note that it is enough to have one **error** edge per array bound check (even if they are actually two checks done), since the positional information will be the same for both (the place in the source code where the index is passed). ===== Range analysis ===== Implement range analysis on top of your control-flow graphs. Use these files adapted from [[Lecture 11]] to do that (note that you need to overwrite the file LabeledDirectedGraph.scala from last lab): * [[lab11-labeleddirectedgraph|LabeledDirectedGraph.scala]] * [[lab11-lattices|Lattices.scala]] * [[lab11-dfa|DataFlowAnalysis.scala]] Your analysis should stabilize at a fix-point where for all integer variables you know at each program point some estimation of the range in which their value can be. Additionally, for each array you should keep trace of its possibles sizes in another range. You should probably take the following steps: - Collect all integer and integer array variables in your control flow graph (you can omit ''result#'' if it is of one of these types) - Define a new PointWiseRangeLattice, as in [[signanalysis.scala|SignAnalysis.scala]] - Instantiate it with the collected variables - Write the appropriate transfer function to compute the effect of CFG statements You should be conservative in your analysis: if you don't know anything about an integer, for instance because it was returned from a method, assume the worst case. For class members, you cannot do much either, unless you could know for sure that no method was called between a write and read to a given field (why?). You are allowed to make worst-case assumptions about values read from fields. ===== Emitting warnings ===== It should be clear that the property "an assertion can be violated" is equivalent to "the corresponding **error** node in the control-flow graph is reachable". Using the information from the static analysis, traverse the control-flow graphs and emit a warning each time you are able to reach an **error** node. Use the positional information from that node to emit a warning such as "assertion potentially violated.". You don't have to distinguish between the assertions you generated for the array bounds and the ones originally in the program, but in the first case, the positional information should point to the place in the source code corresponding to the array index expression. You may realize now that your analysis can check more properties than just array access safety. If you write **assert** statements with reasonable test conditions, it should be able to check them too.