LARA

Labs 05

This week you will implement type checking in your Tool compiler. After this step, you will have completed the front-end of your compiler. This means that it will be able to reject all invalid programs, and accept all valid programs. You will then be able to turn these valid inputs into assembly code that runs on the JVM, just like javac and scalac do. Isn't that great? We think it's great.

Type checking

A valid Tool program has the following properties:

  • It follows the Tool concrete syntax.
  • It respects all the constraints mentioned in Labs 04.
  • Method overriding respects some typing constraints:
    • The overriding method must have exactly as many arguments as the overridden one.
    • The types of the parameters in the overriding and overridden methods must match exactly (no contravariance allowed).
    • The return type must match exactly (no covariance allowed, as opposed to Java, for instance). Note however that the return expression of the method is allowed to have a subtype of the declared return type.
  • All expressions typecheck and have the expected type (the returned expression matches the declared return type, for instance).
  • All statements typecheck.

Your goal in this assignment is to enforce all the constraints not enforced already by the previous phases.

Note: The language and the type rules presented in the course may differ from the rules of Tool. If there are any differences, please use the description on the current page for your implementation, and not the rules in the lecture. Of course, feel free to clarify with us if you have any changes.

Types

The following primary types exist in Tool (note that we prefix them with T to differentiate them from the tree nodes with the same name, for instance):

  • TInt
  • TBoolean
  • TInt[]
  • TString 1)

Additionally, we have object types:

  • TClass[name]

We define a subtyping relation on these types. All primary types are subtypes of themselves and of no other type. For instance:

  • TInt <: TInt

All object types are subtypes of themselves and the special “Object” object type. The subtyping relation is also transitive.

  • TClass[name] <: TClass[name] and TClass[name] <: TClass[Object]
  • TClass[B] <: TClass[A] and TClass[C] <: TClass[B] implies TClass[C] <: TClass[A]

With this in mind, we give some of the non-trivial typing constraints. This is naturally not an exhaustive list of what you should implement, but we expect you to be able to deduce the other rules unambiguously yourself (if in doubt about a rule, ask on the Moodle forum).

Overloaded "+"

The “+” operator can represent integer addition, or string concatenation. If the types of e1 and e2 are T1 and T2 respectively, we have for the type Ts of e1 + e2:

  • T1 = TInt and T2 = TIntTs = TInt
  • T1 = TString and T2 = TIntTs = TString
  • T1 = TInt and T2 = TStringTs = TString
  • T1 = TString and T2 = TStringTs = TString

All other values for T1 and T2 should result in type errors.

Comparison operator

The “==” operator is also overloaded. Expression e1==e2 is type correct if and only if one of the following two cases apply:

  • The types of e1 and e2 are both class types (in which case they can be different classes)
  • The types of e1 and e2 are equal.

Note that it is not type correct to compare a primitive type to a class type.

Method calls

The expression on which the method is called (e.g. e in e.m(args)) must be of an object type, and its class must declare the method called. The number of arguments must of course match the number of parameters. The arguments passed must be subtypes of the declared parameters (matching one-by-one).

Assignment

Assignment of an expression of type T can only be done to a variable of type S such that T <: S.

Assignment to array elements can only be done through an array variable, and the index must be an integer expression. The assigned value must be an integer.

This

this is always considered to carry the object type corresponding to the class where it occurs.

Returned expression

The returned expression must be of a subtype of the declared return type.

Printing out

We will consider println calls to be valid if the argument is an integer, a Boolean or a string.

Stubs and suggested implementation

This week adds or updates the following files:

Types.scala includes the definitions of types. Along with the Tool types, the following special types are included:

  • TUntyped is the initial type for the symbols before you set the type. Obviously it should be absent from any valid program after TC.
  • TError represents the erroneous type for an expression that does not typecheck. For example the type of “foo” - 1 will compute to TError. This should also be absent from any valid program after TC.
  • TObject represents the top of the class hierarchy (like Object in Java) as described above.

Notice that the types also define the subtyping relation. Your first task is to complete the definition for TClass.

This file also defines the Typed trait which will be attached to classes which define a type.

Symbols.scala has been updated to include types of Symbols. The type of a Symbol gets initialized to TUntyped and has to be set later. An exception to this is the ClassSymbol.

Trees.scala has been updated to include types of expressions. Some expressions always have the same type (like Minus), which is then defined as a val. For the rest, the type depends on their subexpressions and can change when we set symbol types in the program, so it is a def.

Your second task is to complete the definition of getType for the missing cases.

NameAnalysis.scala has not changed; however you have to update it now that we have types. Your third task is to modify NameAnalysis so that

  • it attaches types to the various symbols. Since symbols are shared, this has the advantage that you can recover the proper type from any occurrence of the symbol.
  • it enforces the overriding type constraints on methods.

TypeChecking.scala: Finally your fourth task is to implement the typechecker. The main method here is tcExpr which traverses the tree and checks that types of expressions match the expected types (see code). While you typecheck expressions, you also have to attach the proper symbols to the occurrences of method calls (since you can now determine the class from which they are called). You may also have to impose some extra typing rules on some expressions, like MethodCall and Equals.

User-friendliness

It is very important that your compiler does not stop at the first type error! Use ctx.reporter.error to report as many type errors as possible!

Stubs

As usual, merge the stubs of the parser into your main branch by typing

git fetch –all git merge origin/Lab05

(assuming that origin is the name of your remote repository).

It should merge the following files in your project. If you get conflicts, don't panic, they should be relatively easy to resolve.

toolc
  ├── Main.scala                (updated this week)
  │
  ├── lexer
  │    ├── Lexer.scala      
  │    └── Tokens.scala
  │
  ├── eval                     
  │    └── Evaluator.scala
  │
  ├── analyzer
  │     ├── Symbols.scala       (updated this week)
  │     ├── NameAnalysis.scala
  │     ├── TypeChecking.scala  (stubs provided this week)
  │     └── Types.scala         (stubs provided this week)
  ├── ast
  │    ├── ASTConstructor.scala
  │    ├── ASTConstructorLL1.scala
  │    ├── Parser.scala
  │    ├── Printer.scala
  │    └── Trees.scala          (updated this week)
  │
  └── utils
       ├── Context.scala
       ├── Positioned.scala
       ├── Reporter.scala
       └── Pipeline.scala

Deliverables

As usual, please choose a commit from your git repository as a deliverable on our server before Tuesday, Nov. 29nd, 23h59


1)
We consider String to be a primary type, unlike in Java where it is a proper class. No methods can be called on Strings, for instance.