Tokens (Words) of the While Language
token = word (in a general sense)
What are the tokens in the squares.while example?

    // Prints n^2 for n=1...10
    i = 0;
    j = 1;
    while (i < 10) {
      println("", j);
      i = i + 1; // increment counter
      j = j + 2*i+1;
    }

Several token types (kinds of tokens):
token type | examples | description | regular expression |
---|---|---|---|
identifier | i j sum | sequence of letters and digits starting with a letter | letter (letter \| digit)* |
keyword | while println | special identifiers | |
integer constant | 0 10 1 | nonempty sequence of digits | digit digit* |
string constant | "hello" "value is:" | sequence of characters in quotes | " anyCharExceptQuote* " |
EQUAL | = | = | = |
PLUS | + | + | + |
LEQ | <= | <= | <= |
LESS | < | < | < |
COMMA | , | , | , |
LPAREN | ( | ( | ( |
… | … | … | … |
Note: we could instead treat each keyword as its own token type; use whichever is more convenient.
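The table above can be turned into a small lexer almost line by line. The following is a minimal sketch in Python (not the language of these notes), assuming the regular expressions from the table. The token names `TIMES`, `SEMICOLON`, `RPAREN`, `LBRACE` and `RBRACE`, the constant `SQUARES` and the function `tokenize` are hypothetical names introduced here for illustration; the table elides those rows with "…". Keywords are recognized as identifiers first and then reclassified, which is one of the two options mentioned in the note.

```python
import re

# Keywords are "special identifiers": match them as identifiers, then reclassify.
KEYWORDS = {"while", "println"}

# One (name, regex) pair per token type. Python tries alternatives in order,
# so LEQ must come before LESS, otherwise "<=" would be split into "<" and "=".
# TIMES, SEMICOLON, RPAREN, LBRACE, RBRACE are assumed names (not in the table).
TOKEN_SPEC = [
    ("SKIP",       r"[ \t\r\n]+|//[^\r\n]*"),  # white space and end-of-line comments
    ("IDENTIFIER", r"[A-Za-z][A-Za-z0-9]*"),   # letter (letter | digit)*
    ("INTCONST",   r"[0-9]+"),                 # digit digit*
    ("STRING",     r'"[^"]*"'),                # " anyCharExceptQuote* "
    ("LEQ",        r"<="),
    ("LESS",       r"<"),
    ("EQUAL",      r"="),
    ("PLUS",       r"\+"),
    ("TIMES",      r"\*"),
    ("COMMA",      r","),
    ("SEMICOLON",  r";"),
    ("LPAREN",     r"\("),
    ("RPAREN",     r"\)"),
    ("LBRACE",     r"\{"),
    ("RBRACE",     r"\}"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (token type, text) pairs; ignored parts are dropped."""
    tokens = []
    pos = 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        kind, text = m.lastgroup, m.group()
        pos = m.end()
        if kind == "SKIP":
            continue  # white space and comments produce no token
        if kind == "IDENTIFIER" and text in KEYWORDS:
            kind = "KEYWORD"
        tokens.append((kind, text))
    return tokens


SQUARES = """\
// Prints n^2 for n=1...10
i = 0;
j = 1;
while (i < 10) {
  println("", j);
  i = i + 1; // increment counter
  j = j + 2*i+1;
}
"""

if __name__ == "__main__":
    for kind, text in tokenize(SQUARES):
        print(kind, text)
```

Treating each keyword as its own token type would only require replacing the `KEYWORD` reclassification with, for example, the upper-cased keyword text as the token type.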
We also describe parts treated as white space (ignored):
- space
- tab
- end-of-line characters (CR, LF)
- end-of-line comments starting with //

Can we find a regular expression for
- end-of-line comments?
- white space?
- nested comments (useful to comment out some code)

    /* perhaps some code /* and nested comment */ */
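
One possible answer to these questions, sketched in Python under a few assumptions: end-of-line comments start with `//` as in the squares.while example, and nested comments use `/*` and `*/` as in the example above. The first two cases have simple regular expressions. Properly nested comments do not, because matching every `/*` with its `*/` requires counting to an unbounded depth, which a regular expression cannot do; a lexer typically handles them with an explicit depth counter instead. The names `EOL_COMMENT`, `WHITE_SPACE` and `skip_nested_comment` are hypothetical, introduced here for illustration.

```python
import re

# "//" followed by any characters up to (not including) the end of the line.
EOL_COMMENT = re.compile(r"//[^\r\n]*")

# A nonempty sequence of spaces, tabs, CR and LF.
WHITE_SPACE = re.compile(r"[ \t\r\n]+")


def skip_nested_comment(source, pos):
    """Return the offset just past the nested comment that starts at pos.

    Uses a depth counter rather than a regular expression: depth goes up on
    every "/*" and down on every "*/"; the comment ends when it reaches 0.
    """
    if not source.startswith("/*", pos):
        raise ValueError("no comment opener at this offset")
    depth = 0
    i = pos
    while i < len(source):
        if source.startswith("/*", i):
            depth += 1
            i += 2
        elif source.startswith("*/", i):
            depth -= 1
            i += 2
            if depth == 0:
                return i
        else:
            i += 1
    raise SyntaxError("unterminated comment")


if __name__ == "__main__":
    text = "/* perhaps some code /* and nested comment */ */ j = 1;"
    end = skip_nested_comment(text, 0)
    print(repr(text[end:]))
```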