LARA

Labs 02

Part 1: Introduction to the Tool Compiler Project

Familiarize yourself with the Tool Programming Language and the Tool Compiler Project. Each student should write two example Tool programs (four per group of two) and make sure they compile with the Tool Reference Compiler. Please be creative when writing your programs: we don't need five versions of a program computing the Fibonacci sequence. The examples at the end of the Tool page should convince you that you can write interesting programs.

Remember that you will use these programs for the rest of the semester to test your compiler, so don't make them too trivial! Try to exercise many features of the language.

Deliverable

Please choose a commit from your git repository as a deliverable on our server before Tuesday, Oct. 4th, 11.59pm (23h59). We'll simply look for all .tool files (and ignore what looks like copies of our own programs).

Part 2: Lexer for Tool

This assignment is the first real part of the Tool compiler project. Make sure you have read the general project overview page first.

As testcases, you can start with the ones we provide. After the first week, when Part 1 is completed, we'll also collect all the testcases that you and the other students in the class have written and send you a link to them.

Lexer

Write the lexer for Tool. Here are some details you should pay attention to:

  • Make sure you recognize keywords as their own token type. while, for instance, should be lexed as the token type WHILE, not as an identifier representing an object called “while”.
  • Make sure you correctly register the position of all tokens.
  • In general, it is good to output as many errors as possible (this helps whoever uses your compiler). For instance, if your lexer encounters an invalid character, it can output an error message, skip the character, and keep lexing the rest of the input. After lexing, the compiler still won't proceed to the next phase, but this lets the user correct more than one error per compilation. Use the special BAD token type to mark errors and keep lexing as long as possible.
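The three points above can be sketched in a few lines of Scala. This is only an illustration, not the structure of the provided stubs: the object LexerSketch, the Kind and Token types, the lex function, and the reduced keyword table are all hypothetical names, and the identifier rule (a letter followed by letters, digits, or underscores) is an assumption that you should check against the Tool specification. The sketch shows keywords being looked up after a word has been read, positions being recorded as 1-based line and column, and an invalid character producing both an error message and a BAD token without stopping the lexer.

```scala
object LexerSketch {
  // Hypothetical token kinds; the real ones belong in lexer/Tokens.scala.
  sealed trait Kind
  case object WHILE extends Kind
  case object IF extends Kind
  case object LPAREN extends Kind
  case object RPAREN extends Kind
  case object BAD extends Kind
  case object EOF extends Kind
  final case class ID(name: String) extends Kind

  // A token together with the 1-based line and column of its first character.
  final case class Token(kind: Kind, line: Int, col: Int)

  // Only a small subset of the keywords, for illustration.
  val keywords: Map[String, Kind] = Map("while" -> WHILE, "if" -> IF)

  // Returns all tokens (including BAD ones) and the collected error messages.
  def lex(input: String): (List[Token], List[String]) = {
    val tokens = List.newBuilder[Token]
    val errors = List.newBuilder[String]
    var i = 0
    var line = 1
    var col = 1

    while (i < input.length) {
      val c = input.charAt(i)
      if (c == '\n') {
        i += 1; line += 1; col = 1
      } else if (c.isWhitespace) {
        i += 1; col += 1
      } else if (c.isLetter) {
        // Read the whole word first, then decide: keyword or identifier.
        val start = i
        while (i < input.length &&
               (input.charAt(i).isLetterOrDigit || input.charAt(i) == '_')) i += 1
        val word = input.substring(start, i)
        tokens += Token(keywords.getOrElse(word, ID(word)), line, col)
        col += i - start
      } else if (c == '(') {
        tokens += Token(LPAREN, line, col)
        i += 1; col += 1
      } else if (c == ')') {
        tokens += Token(RPAREN, line, col)
        i += 1; col += 1
      } else {
        // Report the error, mark it with BAD, skip the character, keep going.
        errors += s"$line:$col: invalid character '$c'"
        tokens += Token(BAD, line, col)
        i += 1; col += 1
      }
    }
    tokens += Token(EOF, line, col)
    (tokens.result(), errors.result())
  }
}
```

Because the keyword lookup happens only after the longest possible word has been read, an input such as while1 is lexed as a single identifier rather than as WHILE followed by a number.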

Code Stubs

You can download the stub for this lab here. Here is a short description of some of the files.

  • lexer/Tokens.scala: stub for a file describing token types and tokens.
  • Positional.scala: the Positional trait, nothing to change in there.
  • lexer/Lexer.scala: stub for the Lexer trait.
  • Reporter.scala: the complete Reporter trait, nothing to change.
  • Compiler.scala: the main class for the compiler, which combines the other traits. It contains only a testing method for now; you don't have to edit it.
  • Main.scala: some code to help you test your lexer.
  • testcases/: some example Tool programs.

Mind the package names. The structure of your project src directory should be as follows:

src
 └── toolc
      ├── Compiler.scala
      ├── Main.scala
      ├── Positional.scala
      ├── Reporter.scala
      │
      └── lexer
           ├── Lexer.scala
           └── Tokens.scala
  

Example Output

For reference, here is a possible (incomplete) output for the factorial program:

OBJECT(1:1) ID('Factorial')(1:8) LBRACE(1:18) DEF(2:5) MAIN(2:9) LPAREN(2:13)
RPAREN(2:14) COLON(2:16) UNIT(2:18) EQSIGN(2:23) LBRACE(2:25) PRINTLN(3:9)
LPAREN(3:16) NEW(3:17) ... RETURN(14:9) ID('num_aux')(14:16) SEMICOLON(14:23)
RBRACE(15:5) RBRACE(16:1) EOF(16:2) 

You can also use the reference compiler with the flag

java -jar toolc-reference-?.?.jar toolc.Main --tokens

to show the tokens of any Tool source file. Note that this is just for reference and that you can name your tokens however you wish.
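If you want your own output to resemble the sample above, one possible approach is to override toString on your token type. The sketch below is an assumption about how such a format could be produced, not the reference compiler's code: the object TokenPrinting, the Token case class with a string kind and an optional value, and the render helper are hypothetical names. It follows the convention visible in the sample, where positions are written as (line:column), both 1-based, and identifiers carry their name in single quotes.

```scala
object TokenPrinting {
  // Hypothetical minimal token: a kind name, an optional payload
  // (for example the identifier's name), and a 1-based position.
  final case class Token(kind: String, value: Option[String], line: Int, col: Int) {
    override def toString: String = {
      // Tokens with a payload print it in quotes, e.g. ID('Factorial').
      val payload = value.map(v => s"('$v')").getOrElse("")
      s"$kind$payload($line:$col)"
    }
  }

  // Joins the tokens with single spaces, as in the sample output.
  def render(tokens: Seq[Token]): String = tokens.mkString(" ")
}
```

For example, rendering the three tokens OBJECT at 1:1, ID with value Factorial at 1:8, and LBRACE at 1:18 gives the start of the sample output: OBJECT(1:1) ID('Factorial')(1:8) LBRACE(1:18).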

Deliverable

Please choose a commit from your git repository as a deliverable on our server before Tuesday, Oct. 11th, 11.59pm (23h59). We will scan your repository for a directory called src and compile all the .scala files below it.