These automata are used to construct LR(0) and SLR parsers
We describe their states and transitions
If was original start symbol, we
We will construct states as sets of items
LR(0) item is a dotted production
Note: this is similar to Earley Parser item
We use letters I,J to denote sets of LR(0) items
Example:
Rules in our simple example grammar:
D' ::= E eof E ::= T E ::= E + T T ::= ID
Example state is , equal to the following set of items:
{ D' ::= .E eof, E ::= .T, E ::= .E + T, T ::= .ID }
These functions map sets of items to sets of items
Closure is similar to applying prediction in Earley Parser.
If is a set of states, then closure(I) is least set such that
Example: Take the singleton set
{ D' ::= .E eof }
Then closure of this set is :
{ D' ::= .E eof, E ::= .T, E ::= .E + T, T ::= .ID }
where rules are added in this order using the definition.
Goto corresponds to
Example: Let us compute goto(, ID). Because only one element contains .ID, we obtain
{ T ::= ID. }
and the closure remains the same because . does not appear before a non-terminal.
I1 = goto(I0, E) = { E ::= E . + T } I2 = goto(I1, +) = { E ::= E + . T , T ::= . ID } I3 = goto(I2, T) = { E ::= E + T. }
Initial state
Example: What we denoted was indeed a correct initial state.
Edges are labeled by grammar terminals and non-terminals
States and edges are the least sets (T,E) such that
The automaton in this example is deterministic
The automaton represents possible sets of slices through syntax trees. The state records, for each non-terminal the way in which it's last ocurrence was unfolded in the tree.
Consider a parse tree for
ID + ID + ID ^
when parser has read only token ID and +, we have the following cut through the tree, and the last expansions of E and T describe the state.