Lecturecise 02: Formal Languages and Lexical Analysis
Terminology: lexical analyzer = lexer = scanner
Key insights:
- lexical analyzer maps a stream of characters into a stream of tokens
- while doing that, it typically needs only finite memory
- we can specify tokens for a lexical analyzers using regular expressions
- it is not difficult to construct a lexical analyzer manually – we give an example
- in such case, we often use the first character to decide on token class; there is a notion first(L) = { a | aw in L }
References
- Tiger book, Chapters 1-2
- Compiler Construction by Niklaus Wirth, Chapters 1-3
Background on regular languages and automata
- Regular Languages and Finite Automata from Andrew M. Pitts