Labs 07: Mini Project

This project has several deadlines to ensure that the project extension you pick is reasonable. Please read below.

You have now written a compiler for a simple language. The final mini-project is to design an implement a new functionality of your own choice. Typically you will extend the compiler you built so far with a new feature. In preparation for this, you should aim to learn about the problem domain by searching the appropriate literature. The mini project includes:

designing and implementing the new functionality
documenting the results in a written report document

This will be a team project for groups of 1-3. You can work on top of any of the group's repositories.

Selecting a Project Topic

Deadline: Friday, 9 December 2016

We next list several ideas, but you should also feel free to submit your own by email. All groups will rank the mini-projects in order of preference, and we will then apply a matching algorithm to assign them.

Send to Manos (emmanouil.koukoutos@epfl.ch) an e-mail titled “Project selection: groups n1[,n2[,n3]]” (where ni your group numbers), containing the top 5 projects you would like to do, in order. We will do our best to distribute projects to groups.

Project Proposal

Deadline: 21 December 2016 (maybe 19 Dec. for a few groups)

When the bidding phase is over and a mini-project has been assigned to your group, you should prepare a 5 minutes presentation (max 5 slides) describing:

a basic description of the features you want to add to the compiler and/or to the language
some (short) programs highlighting the use of these features, with a description of how your compiler currently behaves on them (if it supports them at all), and how the improvements will change that behavior
a sketch of the changes you will make to each compiler phase and of the potential new phases you'll have

The third point should not be a complete implementation plan, but it should demonstrate that you understand how to tackle the problem you chose.

We will discuss the ideas with the groups during the lab session and help you prepare your proposal. Please make sure that you have selected a project and submitted a short proposal in time. You will present your idea during the lab sessions on 21/12 (and maybe 19/12 if we have too many groups). We will ask that you send us our slides the night before so we streamline the process.

At the end of the project you will have to write a report on your compiler extension, and (part of) this proposal will serve as its introduction. The proposal and the report are to be submitted through Git and using the git repository server.

List of Ideas

Below we list some ideas for projects. There are a few easy projects that are targeted to single-person groups (as indicated by the project name). Still, the projects vary in difficulty, so we will be more demanding from the easier ones. Remember that the following list of ideas is not exhaustive. In fact, you are encouraged to submit your own ideas for discussion. You can even propose something and set it as your second best choice in case your preferred mini project cannot be assigned to you.

Your Own Idea™

The great idea you proposed and discussed with us.

Optimization: tail-call elimination

Allow multiple return locations within methods. Implement tail-call elimination so that tail-recursive methods can be implemented efficiently without using too much stack space.

class Helper {
  def last(a: List) = {
    if (a.hasNext()) {
       return last(a.next());
    } else {
       return a;
    }
  }
}

To simplify your task, you only need to optimize simple recursive tail calls (not mutual recursion).

Output C code

Write a pretty printer to output C code. This will replace the default Tool code generation. Things you need to consider are:

How do you represent objects?
How do you encode inheritance and dynamic dispatch? You may need to use casts in C.

Output Javascript code

Write a pretty printer to output Javascript code. This will replace the default Tool code generation. The latest Javascript standard should be quite a convenient platform to output code.

Generics: Type-Parametric Programming

Introduce a form of generics into the language. Generics are a way to write functions that are reusable with different types. You could write code such as:

class LinkedList[T1] {
  var next: LinkedList[T1];
 
  ...
 
  def addElement(elem: T1): Bool = { ... }
 
  def at(index: Int): T1 = { ... }
}

Value classes

Add support for value classes in Tool, like in Scala. Value classes contain only one field and are represented only by this field, which is allocated on the stack. No subtyping is supported. You need to consider how methods of a value class can be implemented.

Traits: Multiple Inheritance

Add support multiple inheritance using traits and abstract methods in Tool, in the same way that Scala does. To simplify the project, we will forbid traits from defining fields. We need to define a clear mechanism to for resolving conflicts that typically arise in diamond-shaped inheritance graphs. This will be important for code generation; see e.g. this link for an idea about how Scala does it.

JVM Link: Interface with Other Classes on the JVM

Add support for “foreign” class and function declarations, so that you can use arbitrary Java classes in your code. For instance, you could write:

foreign class java.util.Random as JRandom {
  def nextInt(): Int;
}

…to introduce the class JRandom with the method nextInt(). You will have to search the classpath at compile-time for classes that match the interface (if any), for instance using BCEL.

Type Inference: Write Less Type Annotations

Implement support for type inference, so that users do not need to specify (all) types. Infer types for return types, class fields, local variables. Issue warnings or errors when types could not be inferred.

class TypeUser {
  def run(a: Int) = {
     val b = this.identity(a);
     return b;
  }
 
  def identity(a: Int) = { return a; }
}

First-class Functions: Functions are Values

Introduce higher-order functions into the language, including anonymous function declarations. Allow functions to be passed as arguments, stored in variables, etc. You will need to extend your type system and learn about lambda lifting. To make things easier, we forbid a lambda to assign variables it can access in its environment (it should be allowed to access them) (Bonus: Lift this restriction!).

class List {
  // ...
 
  def forall(t: Int => Bool): Bool = {
     var c = this;
     var res = true;
     while(c.hasNext() && res) {
       if (!t(c.value()) {
         res = false;
       }
       c = c.next();
     }
     return res;
  }
 
  def test() = {
     this.forall( (x: Int) => { return x > 2; });
  }
 
}

Overloading (1 Person)

Implement support for method overloading by argument count and type.

ADTs and Pattern Matching

Introduce algebraic data types (ADTs, similar to Scala case classes) and pattern matching into the compiler. ADTs are immutable tree structures that are the core data types of most functional programming languages.

You will need to add support for the declaration of ADTs, so that you can write, for instance:

  type List = Nil | Cons(head: Int, tail: List)
  type Tree = Leaf | Node(left: Tree, value: Int, right: Tree)

Typically, you program with them by using pattern-matching:

def size(t: Tree) : Int = {
  return (tree match {
    case Leaf => 0
    case Node(left, _, right) => 1 + size(left) + size(right)
  });
}

Liberal Syntax

Extend the syntax to allow:

Semicolon inference
Methods as infix operators

This will allow you to rewrite code that looks like

def matrixLinearCombination(xa : Int, ma : Matrix, xb : Int, mb : Matrix) : Matrix = {
  var result : Matrix;
  result = ma.times(xa).plus(mb.times(xb));
  return result;
}

into

def matrixLinearCombination(xa : Int, ma : Matrix, xb : Int, mb : Matrix) : Matrix = {
  var result : Matrix
  result = ma * xa + mb * xb
  return result
}

Macros

Implement a mechanism within the compiler to allow a Tool user to specify compiler macros, which will give users the ability to control the generated trees. For instance, you could consider a branching macro that would select a certain version of the code depending on certain static values.

The Macro expansion should be implemented as a phase within your compiler, after parser.

Parser combinators (1 Person)

Reimplement the Tool lexer and parser with a parser combinator library, e.g. https://github.com/lihaoyi/fastparse.

REPL: Read-Eval-Print-Loop

Implement a REPL for the Tool language. It should support defining classes and (top-level) functions and evaluating expressions. You don't have to support redefinitions. You can take a look at the Scala REPL for inspiration.

Improved Method Arguments & Class Fields

Implement named arguments with potential default values. The default values can be arbitrary Tool expressions.

Implement a mechanism for providing an arbitrary number of arguments to a method.

Allow default values to class members. The default values can be arbitrary Tool expressions.

Provide Tool classes with an extra copy() method that allows a quick cloning of an object while specifying which field to modify (Similar to what Scala's copy() for case classes)

Improved arrays (1 Person)

Improve Tool arrays by adding:

Support for arrays of any type.
Java array literals, e.g. var numbers: Int[]; numbers = {10, 20, 30, 40, 50};
Improve Array API by adding methods like fill, copyOf, sort, toString similar to Java. They have to be implemented as Tool libraries or in JVM bytecode (don't just use Java/Scala libraries).

Packages

Introduce support for packages and qualified names in Tool. You will need to support parsing of multiple files to make it more useful.

Implicit parameters (1 Person)

Add support for implicit parameters to Tool, much like in Scala. A method that expects an implicit parameter (which is not explicitly given) should look for it in its (static) scope. If two implicit parameters are found in scope then the one with the most accurate type is chosen. If both have the same type or uncomparable types, then compilation fails. If no implicit variables of the right type are found in the scope, compilation fails.

Implicit classes

Add support for implicit classes in Tool, just like in Scala.

The following example class represents a range of values:

class Range {
  var from: Int;
  var to: Int;
  def setFrom(x: Int): Range = {
    from = x;
    return this;
  }
  def setTo(x: Int): Range = {
    // do some args verification ...
    to = x;
    return this;
  }
}

We want to be able to construct a range with an intuitive syntax, like

1.to(5)

This should be possible with the introduction of the following implicit class:

implicit class IntToRange(start: Int) {
  def to(end: Int): Range = {
    var range: Range;
    return (new Range().setFrom(start).setTo(end));
  }
}

Tuples (1 Person)

Add support for Tuples to Tool, like in Scala. You should support Tuples of any length. You need to support tuple types e.g. a: (Int, Int), tuple literals e.g. (x, y) and tuple selectors e.g. tuple._1.

Call-by-name parameters

Add call-by-name parameters to Tool, similar to Scala:

def implies(l: Boolean, r: => Boolean): Boolean = !l || r

By-name parameters do not get evaluated before the function is called, but only when they are used. E.g. if

def foo(): Boolean = { println("foo"); return true; }
def bar(): Boolean = { println("bar"); return true; }
def strange(b1: => Boolean, b2: => Boolean): Boolean = { return (b1 && !b1 && b2); }

then strange(foo(), bar()) would print

foo
foo

To simplify matters, by-name parameters will not be allowed to modify variables they capture in their scope. Bonus: Lift this restriction!

String interpolation (1 Person)

Implement string interpolation in Tool, like in Scala. For example, if a=1 and b=2, then

s"Is $a > $b? ${a > b}"

is equal to

"Is 1 > 2? false"

Abstract Interpretation (multiple projects)

Design and implement a static analysis of Tool programs that would detect programmatic errors reliably at compilation.

First, you will need to generate control-flow graphs, then implement a specific abstract interpreter using these graphs. We can constrain the project to intraprocedural analysis, talk to us for details.

For example, your abstract interpreter could

check if variables are always initialized (makes the JVM happier)
do range computation (apply on out of bounds array accesses check). The precision of your analysis is important.
eliminate common subexpressions (see book “Compilers: Principles, Techniques, and Tools”)
do live variable analysis. Apply it to eliminate dead code.
do reaching definition analysis. Apply it on loop-invariant code motion

Exceptions (Hard!)

Introduce throwing and catching exceptions, as well as a way to declare and infer them. You will need to add support for exceptions to the cafebabe library.

Virtual Machine

Write a simple virtual machine which interprets the bytecode emitted by the tool compiler.

Normal version: Write your VM in Scala and interpret the internal data structures used by cafebabe.
Hard version: Write your VM in C (or Scala if you really want) and interpret the .class files.
No Christmas version: Design your own bytecode format, write a backend for the tool compiler that targets it, and write a VM to interpreter.

Scala-style macros (Hard!)

Add support for Scala-style (syntax based) macros in Tool. You will need to implement a reflection library for Tool, and a system to expand these macros in compile time.

Final Compiler and Report

Deadline: 13 January 2017 23h59

Your final report is due on the date, and both will be delivered using Git.

You are encouraged to use the following (LaTeX) template for your report:

LaTeX sources

A PDF version of the template with the required section is available here:

PDF Example

Although you are not required to use this template, your report must contain at least the sections described in it with the appropriate information. Note that writing this report will take some time and that you should not do it at the last minute. This final report is an important part of the compiler project. If you have questions about the template or the contents of the report, make sure you ask them early.

A common question is “how long should the report be?”. There's no definitive answer to that. Considering that the report will contain code examples and a fairly precise description of implementation, it should probably be at least 3 pages. Less would be suspicious. 6 pages is fine. We will not complain for up to 10 pages if the content is meaningful, but extra pages by themselves do not make a report better, so please use as much as you need.