Lecturecise 03: Longest Match. Generating Lexical Analyzers
Key insights:
- we follow the maximal munch (longest match) rule: the lexical analyzer should always eagerly accept the longest token that it can recognize from the current position
- it is possible to automate the construction of lexical analyzers; the starting point of this construction is a conversion of regular expressions to deterministic automata
- tools that automate this construction are part of compiler-compilers such as JavaCC
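A minimal sketch of the maximal munch rule driven by a deterministic automaton is given here. The token classes, state names, and the helpers `char_class`, `longest_match`, and `tokenize` are hypothetical and chosen only for illustration; a generated lexer would derive a transition table like this one from regular expressions rather than have it written by hand.

```python
# Illustrative only: a hand-written DFA for identifiers, integers,
# and the operators =, ==, <, <=. All names here are assumptions.

def char_class(c):
    """Map a character to the input-symbol class used by the DFA."""
    if c.isalpha() or c == "_":
        return "letter"
    if c.isdigit():
        return "digit"
    return c  # punctuation stands for itself


# Transition table: state -> {symbol class -> next state}.
TRANSITIONS = {
    "start": {"letter": "ident", "digit": "int", "=": "assign", "<": "lt"},
    "ident": {"letter": "ident", "digit": "ident"},
    "int": {"digit": "int"},
    "assign": {"=": "eq"},
    "lt": {"=": "le"},
}

# Accepting states and the token class each one reports.
ACCEPTING = {
    "ident": "ID",
    "int": "INT",
    "assign": "ASSIGN",
    "eq": "EQ",
    "lt": "LT",
    "le": "LE",
}


def longest_match(text, pos):
    """Run the DFA from pos and return (token class, end index) for the
    longest accepted prefix, or None if no prefix is accepted."""
    state = "start"
    last_accept = None
    i = pos
    while i < len(text):
        state = TRANSITIONS.get(state, {}).get(char_class(text[i]))
        if state is None:
            break  # the automaton is stuck; fall back to the last accept
        i += 1
        if state in ACCEPTING:
            last_accept = (ACCEPTING[state], i)
    return last_accept


def tokenize(text):
    """Split text into (token class, lexeme) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        match = longest_match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        kind, end = match
        tokens.append((kind, text[pos:end]))
        pos = end
    return tokens
```

With this table, `tokenize("a==b")` produces one `EQ` token for `==` rather than two `ASSIGN` tokens, because the automaton keeps consuming input after reaching the accepting state for `=`. In this small automaton every non-start state is accepting, so the remembered position always coincides with the current one; with token classes that pass through non-accepting states, the same loop would return to the last accepting position.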
References
- Tiger book (Modern Compiler Implementation by Andrew W. Appel), Chapters 1-2
- Compiler Construction by Niklaus Wirth, Chapters 1-3
Background on regular languages and automata
- Regular Languages and Finite Automata by Andrew M. Pitts