Labs 03

This assignment is the first real part of the Tool compiler project. Make sure you read the general project overview page first. Note that the page now contains more information than last week.

Please note that this lab is to be done in one week.

As testcases, you can use the ones that you and the other students in this class wrote. They're available for download here.


Write the lexer for Tool. Here are some details you should pay attention to:

  • Make sure you recognize keywords as their own token type. For instance, while should be lexed as the token type WHILE, not as an identifier named “while”.
  • Make sure you correctly register the position of all tokens.
  • In general, it is good to report as many errors as possible (this helps whoever uses your compiler). For instance, if your lexer encounters an invalid character, it can output an error message, skip that character, and keep lexing the rest of the input. After lexing, the compiler still won't proceed to the next phase, but this lets the user correct more than one error per compilation. Use the special BAD token type to mark errors and keep lexing for as long as possible.
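
The three points above can be sketched together in a few lines of Scala. This is only an illustration under stated assumptions, not the structure of the provided stub: the names LexerSketch, Kind, Token and lex are hypothetical, only two keywords are listed, and positions are assumed to be 1-based line and column numbers.

```scala
// Hypothetical sketch: none of these names come from the lab stub.
object LexerSketch {
  sealed trait Kind
  case object WHILE extends Kind
  case object IF extends Kind
  case object ID extends Kind
  case object BAD extends Kind
  case object EOF extends Kind

  // A token with its kind, its 1-based line and column, and the matched text.
  case class Token(kind: Kind, line: Int, col: Int, text: String)

  // Only two keywords are listed here for brevity; Tool defines more.
  val keywords: Map[String, Kind] = Map("while" -> WHILE, "if" -> IF)

  def lex(input: String): List[Token] = {
    val out = new scala.collection.mutable.ListBuffer[Token]
    var i = 0
    var line = 1
    var col = 1
    while (i < input.length) {
      val c = input.charAt(i)
      if (c == '\n') {
        // A newline moves to the start of the next line.
        i += 1; line += 1; col = 1
      } else if (c.isWhitespace) {
        i += 1; col += 1
      } else if (c.isLetter) {
        // Read the whole word first, then decide: keyword or identifier.
        val start = i
        while (i < input.length &&
               (input.charAt(i).isLetterOrDigit || input.charAt(i) == '_')) {
          i += 1
        }
        val word = input.substring(start, i)
        out += Token(keywords.getOrElse(word, ID), line, col, word)
        col += word.length
      } else {
        // Anything else is unknown in this sketch: mark it BAD,
        // skip one character, and keep lexing.
        out += Token(BAD, line, col, c.toString)
        i += 1; col += 1
      }
    }
    out += Token(EOF, line, col, "")
    out.toList
  }
}
```

Reading the whole word before the keyword lookup is what keeps an identifier such as whilex from being split into WHILE followed by x. A real lexer for Tool would also handle literals, operators and comments, and would report each BAD token through the Reporter.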


You can download the stub for this lab here. Here is a short description of some of the files.

  • lexer/Tokens.scala: stub for a file describing token types and tokens.
  • Positional.scala: the positional trait, nothing to change in there.
  • lexer/Lexer.scala: stub for the Lexer trait.
  • Reporter.scala: the complete Reporter trait, nothing to change.
  • Compiler.scala: the main class for the compiler, which combines the other traits. It contains only a testing method for now; you don't have to edit it.
  • Main.scala: some code to help you test your lexer.
  • testcases/: some example Tool programs.

Mind the package names. The structure of your project src directory should be as follows:

 └── toolc
      ├── Compiler.scala
      ├── Main.scala
      ├── Positional.scala
      ├── Reporter.scala
      └── lexer
           ├── Lexer.scala
           └── Tokens.scala

You can compile the project with:

sbt package

and generate a runner script with:

sbt script

(Note that, unlike in the first lab, this script doesn't require a local installation of Scala: it uses the same one as sbt.)

You can then try it out on an example from the testcases directory. The stub only returns the token EOF.

Example output

For reference, here is a possible (incomplete) output for the factorial program:

OBJECT(1:1) IDFactorial(1:8) LBRACE(1:18) DEF(2:5) MAIN(2:9) LPAREN(2:13) RPAREN(2:14)
COLON(2:16) UNIT(2:18) EQSIGN(2:23) ... RPAREN(13:59) RPAREN(13:60) SEMICOLON(13:61)
RETURN(14:9) IDnum_aux(14:16) SEMICOLON(14:23) RBRACE(15:5) RBRACE(16:1) EOF(16:2)
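
The positions in this output read as line:column, both counted from 1. As a rough illustration of how such a pair can be derived from a character offset, here is a small sketch. The names PositionSketch, lineCol and show are hypothetical and are not part of the stub, and the sample source text in the usage note is an assumption reconstructed from the positions shown above.

```scala
// Hypothetical helper, not part of the lab stub.
object PositionSketch {
  // Converts a 0-based character offset into a 1-based (line, column) pair.
  def lineCol(source: String, offset: Int): (Int, Int) = {
    val before = source.substring(0, offset)
    val line = before.count(_ == '\n') + 1
    // lastIndexOf returns -1 when there is no newline, so the first
    // line starts at offset 0.
    val lineStart = before.lastIndexOf('\n') + 1
    (line, offset - lineStart + 1)
  }

  // Formats a token kind and its position in the style of the example output.
  def show(kind: String, source: String, offset: Int): String = {
    val (line, col) = lineCol(source, offset)
    kind + "(" + line + ":" + col + ")"
  }
}
```

Assuming the program starts with the text "object Factorial {" followed by a newline and a def indented by four spaces, show("DEF", src, 23) gives DEF(2:5), which agrees with the example. Recomputing the position from scratch for every token is simple but slow on large files; tracking the current line and column while reading characters avoids that.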

You can also use the reference compiler with the flag

java -jar toolc-reference.jar toolc.Main --tokens

to show the tokens of any Tool source file. This is just for reference: you can name your tokens however you wish. The reference compiler does not print the positions of the tokens, but yours must, since setting the positions properly is part of the assignment.


Hand in the sources of your project (directory src and below) compressed in a zip file through Moodle by Tuesday, October 12th, 11:55pm (23h55). Please make sure of the following:

  • send us a .zip file, not a .rar file
  • do not include .class files
  • do not include .svn directories and the like