LARA

Labs 07: Mini Project

This project has several deadlines to ensure that the project extension you pick is reasonable. Please read below.

You have now written a compiler for a simple language. The final mini-project is to design an implement a new functionality of your own choice. Typically you will extend the compiler you built so far with a new feature. In preparation for this, you should aim to learn about the problem domain by searching the appropriate literature. The mini project includes:

  • designing and implementing the new functionality
  • documenting the results in a written report document

Selecting a Project Topic

Deadline: 11 December 2013

We next list several ideas, but you should also feel free to submit your own by email. All groups will rank the mini-projects in order of preference on the git repository server ("Mini Projects" tab), and we will then apply a matching algorithm to assign them.

Assignments

    group02 : Your own Idea
    group03 : LLVM: Generate LLVM Intermediate Representation
    group04 : Overloading and Multi-methods
    group05 : Virtual Machine
    group06 : REPL
    group07 : JVM Link: Interface with Other Classes on the JVM
    group08 : Generics: Type-Parametric Programming
    group09 : Macros
    group10 : First-class Functions
    group11 : Alternative Frontend
    group12 : Graphics Primitives
    group13 : Your Own Idea
    group14 : Your own idea
    group15 : Improved Arguments and Fields
    group16 : Abstract Interpretation: Prevent More Errors
    group17 : Liberal Syntax
    group18 : Continuation Passing Transformation
    group19 : Traits: Multiple Inheritance
    group20 : Type Inference: Write Less Type Annotations
    group21 : LLVM
    group22 : Exceptions
    group24 : ADTs with Pattern-Matching
    group26 : Garbage Collection
    group28 : Optimizations: Tail-Calls Elimination
    group29 : Generators / Co-Routines

Project Proposal

Deadline: 18 December 2013

When the bidding phase is over and a mini-project has been assigned to your group, you should prepare a short report describing:

  • a basic description of the features you want to add to the compiler and/or to the language
  • some (short) programs highlighting the use of these features, with a description of how your compiler currently behaves on them (if it supports them at all), and how the improvements will change that behaviour
  • a sketch of the changes you will make to each compiler phase and of the potential new phases you'll have
  • ideas for further extensions

The third point should not be a complete implementation plan, but it should demonstrate that you understand how to tackle the problem you chose. You will (in principle) not have to implement what you describe in the fourth point. You will only be keeping this in mind so as to avoid running out of work in case the project turns out to be simpler that expected (or for extra credit, if you happened to solve the hard problems too quickly).

We will discuss the ideas with the groups during the lab session and help you prepare your proposal. Please make sure that you have selected a project and submitted a short proposal in time. At the end of the project you will have to write a report on your compiler extension, and (part of) this proposal will serve as its introduction. The proposal and the report are to be submitted through Git and using the git repository server.

List of Ideas

Remember that the following list of ideas is not exhaustive. You can (and are encouraged to) submit your own ideas for discussion. You can even propose something and set it as your second best choice in case your preferred mini project cannot be assigned to you.

Your Own Idea™

The great idea you proposed and discussed with us.

Optimization: tail-call elimination

Allow multiple return locations within methods. Implement tail-call elimination so that tail-recursive methods can be implemented efficiently without using too much stack space.

class Helper {
  def last(a: List) = {
    if (a.hasNext()) {
       return last(a.next());
    } else {
       return a;
    }
  }
}

Garbage Collection

Write a pretty printer to output C code and accompany it with a garbage collector. That way you will be able to generate code that can be compiled with any C compiler into binaries that will be memory-efficient.

You can implement the kind of GC you want, like mark-and-sweep or copying. It does not need to run in parallel.

Generics: Type-Parametric Programming

Introduce a form of generics into the language. Generics are a way to write functions that are reusable with different types. You could write code such as:

class LinkedList[T1] {
  var next: LinkedList[T1];
 
  ...
 
  def addElement(elem: T1): Bool = { ... }
 
  def at(index: Int): T1 = { ... }
}

LLVM: Generate LLVM Intermediate Representation

LLVM is a fast-developing compiler platform. At is core is an intermediate representation language and a suite of compilers from that language to, e.g., native code. The goal of this project is to target LLVM as a new backend for your Tool compiler.

Traits: Multiple Inheritance

Implement support for traits, as in Scala, and add support for abstract methods. Traits allow programmers to use multiple inheritance and have a clear mechanism for resolving conflicts that typically arise in diamond-shaped inheritance graphs.

JVM Link: Interface with Other Classes on the JVM

Add support for “foreign” class and function declarations, so that you can use arbitrary Java classes in your code. For instance, you could write:

foreign class java.util.Random as JRandom {
  def nextInt() : Int;
}

…to introduce the class JRandom with the method nextInt(). You will have to search the classpath at compile-time for classes exist that match the interface, for instance using BCEL.

Abstract Interpretation: Prevent More Errors

Design and implement a static analysis of Tool programs that would detect programatical errors reliably at compilation.

Here are some exmaples of the things you could check:

  • variables are always initialized (makes the JVM happier)
  • null dereference
  • out of bounds array accesses
  • division by zero

For this you will need to implement a control-flow analysis and understand abstract interpretation.

Type Inference: Write Less Type Annotations

Implement support for type inference, so that users do not need to specify (all) types. Because the language is simpler, you can go beyond what Scala supports: infer types for method arguments, return types, fields, local variables. Issue warnings or errors when types could not be inferred.

class TypeUser {
  def run(a: Int) = {
     val b = this.identity(a);
     return b;
  }
 
  def identity(a) = { return a; }
}

First-class Functions: Functions are Values

Introduce higher-order functions into the language, including anonymous function declarations. Allow functions to be passed as arguments, stored in variables, etc. You will need to extend your type system and learn about lambda lifting.

class List {
  // ...
 
  def forall(t: Int => Bool): Bool = {
     var c = this;
     var res = true;
     while(c.hasNext() && res) {
       if (!t(c.value()) {
         res = false;
       }
       c = c.next();
     }
     return res;
  }
 
  def test() = {
     this.forall( (x: Int) => { return x > 2; });
  }
 
  def identity(a) = { return a; }
}

Overloading and Multi-methods

Implement support for method overloading by argument count and type and for multi-methods.

Exceptions

Introduce throwing and catching exceptions, as well as a way to declare and infer them. Among other things, you will need to (slightly) modify the cafebabe library for this. We'll help you.

ADTs and Pattern Matching

Introduce algebraic data types (ADTs, similar to Scala case classes) and pattern matching into the compiler. You can start with lists and extend it to user-defined recursive types (like your ASTs).

You will need to add support for the declaration of ADTs, so that you can write, for instance:

  type List = Nil | Cons(head: Int, tail: List)
  type Tree = Leaf | Node(left: Tree, value: Int, right: Tree)

ADTs are immutable tree structures that are the core data types of most functional programming languages. Typically, you program with them by using pattern-matching:

def size(t: Tree) : Int = {
  return (tree match {
    case Leaf => 0
    case Node(left, _, right) => 1 + size(left) + size(right)
  });
}

For the optional part, you can implement an optimal matching algorithm, check that pattern-matching expressions cover all possible cases, or that all patterns are reachable.

Liberal Syntax

Extend the syntax to allow:

  • Semicolon inference
  • Methods as infix operators
  • Expressions as statements
  • Operators as methods

This will allow you to rewrite code that looks like

def matrixLinearCombination(xa : Int, ma : Matrix, xb : Int, mb : Matrix) : Matrix = {
  var result : Matrix;
  result = ma.times(xa).plus(mb.times(xb));
  return result;
}

into

def matrixLinearCombination(xa : Int, ma : Matrix, xb : Int, mb : Matrix) : Matrix = {
  var result : Matrix
  result = ma * xa + mb * xb
  return result
}

Macros

Implement a mechanism within the compiler to allow a Tool user to specify compiler macros, which will give users the ability to control the generated trees. For instance, you could consider a branching macro that would select a certain version of the code depending on certain static values.

The Macro expansion should be implemented as a phase within your compiler, after parser.

C Virtual Machine

Write a simple virtual machine (typically in C, but you can choose Scala if you prefer), and replace your code generation phase to target your virtual machine. a new back-end to target this machine. Keep the format simpler than class files so make the project feasible. Agree with us on a list of op-codes to implement.

REPL: Read-Eval-Print-Loop

Implement a REPL for the Tool language. Make sure your REPL is able to handle redefinitions. You can take a look at the Scala REPL for inspiration.

Extend the language with Graphics/Interactive Primitives

Add primitives to do graphics in toolc. Enable drawing shapes, choose colors and displaying bitmaps. Test primitives to build a game such as pong or similar. Design friendly language syntax to describe graphical elements.

You also need to add primitives for interacting with the user via the keyboard or mouse. The language should be extended in an elegant way for graphical application developers.

Improved Arguments and Fields

Implement named arguments with potential default values. Allow default values to class members as well. The default values can be arbitrary Tool expressions.

Implement a mechanism for providing an arbitrary number of arguments to a function.

Provide Tool classes with an extra copy() method that allows a quick cloning of an object while specifying which field to modify (Similar to what Scala's copy() for case classes)

Generators - Co-Routines

Allow more general change of control between procedures, as appropriate, for example, for for-statements that have 'yield' conditions as well as more generally, co-routines, a form of cooperative concurrency, where a procedure explicitly decides when to suspend its execution to allow others to resume.

See http://en.wikipedia.org/wiki/Iterator

Continuation Passing Transformation

Compile programs by systematically introducing the ability to manipulate the labels in which execution will continue, as well as the current state. Such description of future computation is called continuation. Continuations can in principle be stored and executed later like functions (closures).

Alternative Syntax using Parser Generator

Implement alternative front-end for Tool, using a parser generator such as

http://bnfc.digitalgrammars.com/

Final Compiler and Report

Deadline: 10 January 2014 23h59

Your final report is due on the date, and both will be delivered using Git.

You are encouraged to use the following (LaTeX) template for your report:

A PDF version of the template with the required section is available here:

Although you are not required to use this template, your report must contain at least the sections described in it with the appropriate information. Note that writing this report will take some time and that you should not do it at the last minute. This final report is an important part of the compiler project. If you have questions about the template or the contents of the report, make sure you ask them early.

A common question is “how long should the report be?”. There's no definitive answer to that. Considering that the report will contain code examples and a fairly precise description of implementation, it should probably be at least 3 pages. Less would be suspicious. 6 pages is fine. We will not complain for up to 10 pages if the content is meaningful, but extra pages by themselves do not make a report better, so please use as much as you need.