Tokens (Words) of While Language
token = word (in a general sense)
What are the tokens in the squares.while example?
// Prints n^2 for n=1...10
i = 0;
j = 1;
while (i < 10) {
    println("", j);
    i = i + 1; // increment counter
    j = j + 2*i+1;
}
Several token types (kinds of tokens):
| token type | examples | description | regular expression |
|---|---|---|---|
| identifier | i j sum | sequence of letters and digits starting with a letter | letter (letter \| digit)* |
| keyword | while println | special identifiers | |
| integer constant | 0 10 1 | nonempty sequence of digits | digit digit* |
| string | "hello" "value is:" | sequence of characters in quotes | " anyCharExceptQuote* " |
| EQUAL | = | = | = |
| PLUS | + | + | + |
| LEQ | <= | <= | <= |
| LESS | < | < | < |
| COMMA | , | , | , |
| LPAREN | ( | ( | ( |
| … | … | … | … |
Note: we could instead treat each keyword as its own token type (one for while, one for println); choose whichever is more convenient.
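To make the table concrete, here is a minimal sketch of a tokenizer that tries the regular expressions above at each position of the input. It is one possible encoding, not the course's reference lexer: the names `TOKEN_SPEC`, `KEYWORDS`, and `tokenize` are assumptions made for illustration, keywords are treated as special identifiers (a single `KEYWORD` type), and only the token types listed explicitly in the table are covered, so any other character is reported as an error.

```python
import re

# Hypothetical names, chosen for this sketch only.
KEYWORDS = {"while", "println"}

# Order matters: Python's regex alternation takes the first alternative
# that matches, so LEQ must come before LESS for "<=" to be one token.
TOKEN_SPEC = [
    ("SKIP",   r"[ \t\r\n]+"),            # white space, ignored
    ("ID",     r"[A-Za-z][A-Za-z0-9]*"),  # letter (letter | digit)*
    ("INT",    r"[0-9]+"),                # digit digit*
    ("STRING", r'"[^"]*"'),               # " anyCharExceptQuote* "
    ("LEQ",    r"<="),
    ("LESS",   r"<"),
    ("EQUAL",  r"="),
    ("PLUS",   r"\+"),
    ("COMMA",  r","),
    ("LPAREN", r"\("),
]

# One combined pattern with a named group per token type.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (token type, lexeme) pairs for the given text."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            # Character not covered by the rows of the table used here.
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        kind, lexeme = m.lastgroup, m.group()
        pos = m.end()
        if kind == "SKIP":
            continue
        if kind == "ID" and lexeme in KEYWORDS:
            kind = "KEYWORD"  # keywords are special identifiers
        tokens.append((kind, lexeme))
    return tokens


if __name__ == "__main__":
    print(tokenize("i = i + 1"))
    print(tokenize('while (i <= 10, "hello"'))
```

Classifying keywords after matching an identifier keeps the regular expressions simple and means an identifier such as `whilex` is still a single identifier token rather than a keyword followed by `x`.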
We also describe parts treated as white space (ignored):
- space
- tab
- end of line signs (CR,LF)
- end of line comments starting with //

Can we find a regular expression for
- end of line comment?
- white space?
- nested comments (useful to comment out some code), such as /* perhaps some code /* and nested comment */ */ ?
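One way to explore the questions about comments and white space is the sketch below. It assumes end of line comments start with `//`, as in the squares.while example; the names `EOL_COMMENT`, `WHITE_SPACE`, and `skip_nested_comment` are hypothetical. The first two cases have straightforward regular expressions. Nested comments do not, in the strict formal sense: recognizing arbitrarily deep nesting requires counting open and close markers, much like balanced parentheses, so the sketch uses a depth counter instead.

```python
import re

# End of line comment: "//" followed by anything up to the end of the line.
EOL_COMMENT = re.compile(r"//[^\r\n]*")

# White space: any nonempty mix of space, tab, CR, LF and end of line comments.
WHITE_SPACE = re.compile(r"(?:[ \t\r\n]|//[^\r\n]*)+")


def skip_nested_comment(text, pos):
    """Return the index just past the (possibly nested) comment starting at pos."""
    if not text.startswith("/*", pos):
        raise ValueError("no comment starts at this position")
    depth = 0
    i = pos
    while i < len(text):
        if text.startswith("/*", i):
            depth += 1          # one more comment opened
            i += 2
        elif text.startswith("*/", i):
            depth -= 1          # innermost open comment closed
            i += 2
            if depth == 0:
                return i        # the outermost comment is now closed
        else:
            i += 1
    raise ValueError("unterminated comment")


if __name__ == "__main__":
    source = "/* perhaps some code /* and nested comment */ */ i = 0;"
    end = skip_nested_comment(source, 0)
    print(repr(source[end:]))
```

A lexer built from regular expressions can still handle nested comments by calling a small helper like this whenever it sees `/*`, which is a common compromise.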