Labs 07: Mini Project
This project has several deadlines to ensure that the project extension you pick is reasonable. Please read below.
You have now written a compiler for a simple language. The final mini-project is to design an implement a new functionality of your own choice. Typically you will extend the compiler you built so far with a new feature. In preparation for this, you should aim to learn about the problem domain by searching the appropriate literature. The mini project includes:
- designing and implementing the new functionality
- documenting the results in a written report document
Selecting a Project Topic
Deadline: 11 December 2013
We next list several ideas, but you should also feel free to submit your own by email. All groups will rank the mini-projects in order of preference on the git repository server ("Mini Projects" tab), and we will then apply a matching algorithm to assign them.
Assignments
group02 : Your own Idea group03 : LLVM: Generate LLVM Intermediate Representation group04 : Overloading and Multi-methods group05 : Virtual Machine group06 : REPL group07 : JVM Link: Interface with Other Classes on the JVM group08 : Generics: Type-Parametric Programming group09 : Macros group10 : First-class Functions group11 : Alternative Frontend group12 : Graphics Primitives group13 : Your Own Idea group14 : Your own idea group15 : Improved Arguments and Fields group16 : Abstract Interpretation: Prevent More Errors group17 : Liberal Syntax group18 : Continuation Passing Transformation group19 : Traits: Multiple Inheritance group20 : Type Inference: Write Less Type Annotations group21 : LLVM group22 : Exceptions group24 : ADTs with Pattern-Matching group26 : Garbage Collection group28 : Optimizations: Tail-Calls Elimination group29 : Generators / Co-Routines
Project Proposal
Deadline: 18 December 2013
When the bidding phase is over and a mini-project has been assigned to your group, you should prepare a short report describing:
- a basic description of the features you want to add to the compiler and/or to the language
- some (short) programs highlighting the use of these features, with a description of how your compiler currently behaves on them (if it supports them at all), and how the improvements will change that behaviour
- a sketch of the changes you will make to each compiler phase and of the potential new phases you'll have
- ideas for further extensions
The third point should not be a complete implementation plan, but it should demonstrate that you understand how to tackle the problem you chose. You will (in principle) not have to implement what you describe in the fourth point. You will only be keeping this in mind so as to avoid running out of work in case the project turns out to be simpler that expected (or for extra credit, if you happened to solve the hard problems too quickly).
We will discuss the ideas with the groups during the lab session and help you prepare your proposal. Please make sure that you have selected a project and submitted a short proposal in time. At the end of the project you will have to write a report on your compiler extension, and (part of) this proposal will serve as its introduction. The proposal and the report are to be submitted through Git and using the git repository server.
List of Ideas
Remember that the following list of ideas is not exhaustive. You can (and are encouraged to) submit your own ideas for discussion. You can even propose something and set it as your second best choice in case your preferred mini project cannot be assigned to you.
Your Own Idea™
The great idea you proposed and discussed with us.
Optimization: tail-call elimination
Allow multiple return locations within methods. Implement tail-call elimination so that tail-recursive methods can be implemented efficiently without using too much stack space.
class Helper { def last(a: List) = { if (a.hasNext()) { return last(a.next()); } else { return a; } } }
Garbage Collection
Write a pretty printer to output C code and accompany it with a garbage collector. That way you will be able to generate code that can be compiled with any C compiler into binaries that will be memory-efficient.
You can implement the kind of GC you want, like mark-and-sweep or copying. It does not need to run in parallel.
Generics: Type-Parametric Programming
Introduce a form of generics into the language. Generics are a way to write functions that are reusable with different types. You could write code such as:
class LinkedList[T1] { var next: LinkedList[T1]; ... def addElement(elem: T1): Bool = { ... } def at(index: Int): T1 = { ... } }
LLVM: Generate LLVM Intermediate Representation
LLVM is a fast-developing compiler platform. At is core is an intermediate representation language and a suite of compilers from that language to, e.g., native code. The goal of this project is to target LLVM as a new backend for your Tool compiler.
Traits: Multiple Inheritance
Implement support for traits, as in Scala, and add support for abstract methods. Traits allow programmers to use multiple inheritance and have a clear mechanism for resolving conflicts that typically arise in diamond-shaped inheritance graphs.
JVM Link: Interface with Other Classes on the JVM
Add support for “foreign” class and function declarations, so that you can use arbitrary Java classes in your code. For instance, you could write:
foreign class java.util.Random as JRandom { def nextInt() : Int; }
…to introduce the class JRandom with the method nextInt(). You will have to search the classpath at compile-time for classes exist that match the interface, for instance using BCEL.
Abstract Interpretation: Prevent More Errors
Design and implement a static analysis of Tool programs that would detect programatical errors reliably at compilation.
Here are some exmaples of the things you could check:
- variables are always initialized (makes the JVM happier)
- null dereference
- out of bounds array accesses
- division by zero
For this you will need to implement a control-flow analysis and understand abstract interpretation.
Type Inference: Write Less Type Annotations
Implement support for type inference, so that users do not need to specify (all) types. Because the language is simpler, you can go beyond what Scala supports: infer types for method arguments, return types, fields, local variables. Issue warnings or errors when types could not be inferred.
class TypeUser { def run(a: Int) = { val b = this.identity(a); return b; } def identity(a) = { return a; } }
First-class Functions: Functions are Values
Introduce higher-order functions into the language, including anonymous function declarations. Allow functions to be passed as arguments, stored in variables, etc. You will need to extend your type system and learn about lambda lifting.
class List { // ... def forall(t: Int => Bool): Bool = { var c = this; var res = true; while(c.hasNext() && res) { if (!t(c.value()) { res = false; } c = c.next(); } return res; } def test() = { this.forall( (x: Int) => { return x > 2; }); } def identity(a) = { return a; } }
Overloading and Multi-methods
Implement support for method overloading by argument count and type and for multi-methods.
Exceptions
Introduce throwing and catching exceptions, as well as a way to declare and infer them. Among other things, you will need to (slightly) modify the cafebabe library for this. We'll help you.
ADTs and Pattern Matching
Introduce algebraic data types (ADTs, similar to Scala case classes) and pattern matching into the compiler. You can start with lists and extend it to user-defined recursive types (like your ASTs).
You will need to add support for the declaration of ADTs, so that you can write, for instance:
type List = Nil | Cons(head: Int, tail: List) type Tree = Leaf | Node(left: Tree, value: Int, right: Tree)
ADTs are immutable tree structures that are the core data types of most functional programming languages. Typically, you program with them by using pattern-matching:
def size(t: Tree) : Int = { return (tree match { case Leaf => 0 case Node(left, _, right) => 1 + size(left) + size(right) }); }
For the optional part, you can implement an optimal matching algorithm, check that pattern-matching expressions cover all possible cases, or that all patterns are reachable.
Liberal Syntax
Extend the syntax to allow:
- Semicolon inference
- Methods as infix operators
- Expressions as statements
- Operators as methods
This will allow you to rewrite code that looks like
def matrixLinearCombination(xa : Int, ma : Matrix, xb : Int, mb : Matrix) : Matrix = { var result : Matrix; result = ma.times(xa).plus(mb.times(xb)); return result; }
into
def matrixLinearCombination(xa : Int, ma : Matrix, xb : Int, mb : Matrix) : Matrix = { var result : Matrix result = ma * xa + mb * xb return result }
Macros
Implement a mechanism within the compiler to allow a Tool user to specify compiler macros, which will give users the ability to control the generated trees. For instance, you could consider a branching macro that would select a certain version of the code depending on certain static values.
The Macro expansion should be implemented as a phase within your compiler, after parser.
C Virtual Machine
Write a simple virtual machine (typically in C, but you can choose Scala if you prefer), and replace your code generation phase to target your virtual machine. a new back-end to target this machine. Keep the format simpler than class files so make the project feasible. Agree with us on a list of op-codes to implement.
REPL: Read-Eval-Print-Loop
Implement a REPL for the Tool language. Make sure your REPL is able to handle redefinitions. You can take a look at the Scala REPL for inspiration.
Extend the language with Graphics/Interactive Primitives
Add primitives to do graphics in toolc. Enable drawing shapes, choose colors and displaying bitmaps. Test primitives to build a game such as pong or similar. Design friendly language syntax to describe graphical elements.
You also need to add primitives for interacting with the user via the keyboard or mouse. The language should be extended in an elegant way for graphical application developers.
Improved Arguments and Fields
Implement named arguments with potential default values. Allow default values to class members as well. The default values can be arbitrary Tool expressions.
Implement a mechanism for providing an arbitrary number of arguments to a function.
Provide Tool classes with an extra copy() method that allows a quick cloning of an object while specifying which field to modify (Similar to what Scala's copy() for case classes)
Generators - Co-Routines
Allow more general change of control between procedures, as appropriate, for example, for for-statements that have 'yield' conditions as well as more generally, co-routines, a form of cooperative concurrency, where a procedure explicitly decides when to suspend its execution to allow others to resume.
Continuation Passing Transformation
Compile programs by systematically introducing the ability to manipulate the labels in which execution will continue, as well as the current state. Such description of future computation is called continuation. Continuations can in principle be stored and executed later like functions (closures).
Alternative Syntax using Parser Generator
Implement alternative front-end for Tool, using a parser generator such as
Final Compiler and Report
Deadline: 10 January 2014 23h59
Your final report is due on the date, and both will be delivered using Git.
You are encouraged to use the following (LaTeX) template for your report:
A PDF version of the template with the required section is available here:
Although you are not required to use this template, your report must contain at least the sections described in it with the appropriate information. Note that writing this report will take some time and that you should not do it at the last minute. This final report is an important part of the compiler project. If you have questions about the template or the contents of the report, make sure you ask them early.
A common question is “how long should the report be?”. There's no definitive answer to that. Considering that the report will contain code examples and a fairly precise description of implementation, it should probably be at least 3 pages. Less would be suspicious. 6 pages is fine. We will not complain for up to 10 pages if the content is meaningful, but extra pages by themselves do not make a report better, so please use as much as you need.