Labs 06
This week you will add name analysis to your MiniJava+ compiler. This will considerably ease the task of type checking that you will start next week or the week after. Your analyzer is due on Wednesday, Nov. 5th, 8.15am.
The description of this assignment is rather long. Don't panic :) Most of it is to help you avoid forgetting important points, and we give you a substantial amount of code.
Symbols
The goal of name analysis is twofold: we want to reject programs which contain certain types of errors, and we want to associate symbols to all identifiers.
Symbols are values which uniquely identify all class, method and variable names in a program, by mapping their (multiple) occurrences to their (unique) definition. Identifiers are already present in the AST and contain names as well, but these are not sufficient to make the difference between a class member and a local variable with the same name, for instance. This week we will –among other things– add this missing information to the ASTs.
In the process of mapping occurrences to definitions, we will be able to detect the following kinds or errors:
- a class is defined more than once
- a variable is defined more than once
- a class member is overloaded (forbidden in MiniJava+)
- a method is overloaded (forbidden in MiniJava+)
- a method argument is shadowed by a local variable declaration (forbidden in Java)
- two method arguments have the same name
- a class name is used as a variable but not is declared (as parent class or type name, for instance)
- an identifier is used as a variable but not is declared
- the inheritance graph has a cycle (eg. “class A extends B {} class B extends A {}”)
- (Note that we don't check that method calls correspond to methods defined in the proper class. We will need type-checking for this.)
Additionally, we want to you to emit a warning to the user when a declared variable is never accessed (read or written). This should really be a warning, not an error.
Implementation
In order to attach symbols to trees, we define a new trait, Symbolic
, and a new set of classes for the symbols. The Symbolic
trait is parametrized by a class name which allows us to define the kind of symbols which can be attached to each kind of AST node (see Symbols.scala and Trees.scala later for examples).
You need to write your analyzer such that two nodes referencing the same symbol have the same symbol class instance attached to them (that is, reference equality, structural equality is insufficient). We defined the Symbol
class such that symbols automatically get a unique identifier attached to them at creation. This will allow you to check that you are attaching symbols correctly: you will add an option to your pretty-printer to be able to print these unique numbers along with the identifiers where they occur.
Note that Symbol
s are also Positional
objects, which mean you have to set them a correct position (you should do this by recovering the position from the correct parts of the trees). This is necessary to produce meaningful error messages such as “error: @(12,1) class Cl is defined twice. First definition here: (5,1)”.
Symbols as scopes
We will take advantage from the fact that scopes in MiniJava+ are only of three kinds:
- the global scope (the set of declared classes)
- the class scopes (all members and methods within a class, plus the global scope)
- the method scopes (the parameters and local variables, plus the correct class scope)
This in fact defines a hierarchy among symbols:
- all class symbols are defined in the global scope
- all methods are defined in a class scope
- variables are defined either in a method scope, as arguments or locals, or in a class scope
We encoded this hierarchy in the symbol classes. Therefore, if we have access to a class symbol, for instance, and all symbols were correctly set, we can access from there all method symbols and member variable symbols. This will allow us to easily check if a variable was declared, for instance.
Two phases
Here is how we recommend you proceed for the implementation:
- First, collect all symbols: create the symbol class instances, make sure all of their fields are correctly set, and attach them to the nodes of the AST where the symbols are defined.
- Make the appropriate change to your pretty printer and make sure you see the unique IDs next to the identifiers at the definition points.
- Implement the second phase of your analyzer which consists in attaching the proper symbol to the occurrences of the identifiers. To simplify your task, start by writing
lookup
methods in the symbol classes: they will allow you to easily check whether an identifier was declared and to recover its symbol if it was. Make sure you properly encode the scope rules (including shadowing) in yourlookup
methods. - Use your pretty-printer to make sure you did things correctly.
- Make sure that you throw errors when appropriate.
Execution example
When analyzing the following file:
class Example { public static void main(String [] args) { System.out.println(new B().foo()); } } class B extends A { public int foo() { val = 42; return val; } } class A { int val; public int foo() { boolean val; val = false; return 41; } }
The analyzer should output something like:
class Example#0 { public static void main(String[] args#1) { System.out.println((new B#2().foo#??())); } } class B#2 extends A#3 { public int foo#7() { val#4 = 42; return val#4; } } class A#3 { int val#4; public int foo#5() { boolean val#6; val#6 = false; return 41; } }
Note that:
- Overriding methods have a different symbol than their overridden counterparts.
- Method names in method calls are unresolved symbols.
- You can also simply ignore the “String [] args” part, since no variables can be declared in
main
and MiniJava+ doesn't know about String arrays.
Constraints
Here are all the constraints that your analyzer should enforce:1)
Variable declarations
- No two variables can have the same name in the same scope, unless one of the two cases of shadowing occurs.
- All variables used must be declared.
Shadowing
Shadowing can occur in two different situations:
- a local variable in a method can shadow a class member
- a method parameter can shadow a class member
All other types of shadowing are not allowed in MiniJava+.
Classes
- Classes must be defined only once.
- When a class is declared as extending another one, the other class must be declared and cannot be the main class.
- The transitive closure of the “extends” relation must be irreflexive (no cycles in the inheritance graph).
- When a class name is used as a type, the class must be declared.
Overloading
- Overloading is not permitted:
- In a given class, no two methods can have the same name.
- In a given class, no method can have the same name as another method defined in a super class, unless overriding applies.
Overriding
- A method in a given class overrides another one in a super class if they have the same name and the same number of arguments.2)
- Fields can not be overridden.
Stubs
As usual, here are some files to get you started:
- Symbols.scala contains the definition of symbols,
GlobalScope
and theSymbolic
trait. You only have to write the body of thelookup
methods. - Analyzer.scala contains a stub for the analyzer.
- Trees.scala shows how you can adapt your
Trees.scala
file to attach symbols to the nodes. - TreePrinter.scala contains code to help you adapt your pretty-printer.
- Compiler.scala and Main.scala contain code to run the analyzer after the parser, and to print out a tree with symbols.3) You don't have to change anything in these files.
src └── minijava ├── Compiler.scala (given this week) ├── Main.scala (given this week) ├── Positional.scala (given in Lab 02) ├── Reporter.scala (given in Lab 02) ├── TreePrinter.scala (adapt your work from Lab 04 to the new stub) │ ├── analyzer │ ├── Analyzer.scala (stub given this week) │ └── Symbols.scala (stub given this week) │ ├── lexer │ ├── Lexer.scala (completed in Lab 04) │ └── Tokens.scala (completed in Lab 04) │ └── parser ├── Parser.scala (completed in Lab 04) └── Trees.scala (adapt your work from Lab 04 to the new stub)
Deliverables
Submit a .zip
file of your src
directory through Moodle before Wednesday, Nov. 5th, 8.15am. Make sure your compiler throws errors and terminates using the methods in the Reporter
trait when it encounters problems.