# Automata for LR Parsing without Lookahead

These automata are used to construct LR(0) and SLR parsers

• same automata, SLR just uses some extra information

We describe their states and transitions

If was original start symbol, we

• add (D'::=D eof) to grammar
• make new start symbol

## States: Sets of Items

We will construct states as sets of items

LR(0) item is a dotted production

• is grammar rule
• some strings of terminals and non-terminals

Note: this is similar to Earley Parser item

• but there is no more initial position!

We use letters I,J to denote sets of LR(0) items

• approximates a set of paths in a syntax tree
• left side of tree corresponds to input seen so far

Example:

Rules in our simple example grammar:

D' ::= E eof
E ::= T
E ::= E + T
T ::= ID

Example state is , equal to the following set of items:

{ D' ::= .E eof,
E ::= .T,
E ::= .E + T,
T ::= .ID }

## Transitions: Closure and Goto Functions

These functions map sets of items to sets of items

Closure is similar to applying prediction in Earley Parser.

If is a set of states, then closure(I) is least set such that

• I closure(I)
• if and is a grammar rule, then

Example: Take the singleton set

{ D' ::= .E eof }

Then closure of this set is :

{ D' ::= .E eof,
E ::= .T,
E ::= .E + T,
T ::= .ID }

where rules are added in this order using the definition.

Goto corresponds to

Example: Let us compute goto(, ID). Because only one element contains .ID, we obtain

{ T ::= ID. }

and the closure remains the same because . does not appear before a non-terminal.

I1 = goto(I0, E) = { E ::= E . + T }
I2 = goto(I1, +) = { E ::= E + . T , T ::= . ID }
I3 = goto(I2, T) = { E ::= E + T. }

## Constructing Automaton

Initial state

Example: What we denoted was indeed a correct initial state.

Edges are labeled by grammar terminals and non-terminals

States and edges are the least sets (T,E) such that

• for each terminal and non-terminal and each , if then

## Examples

The automaton in this example is deterministic

• empty set is dead state
• we do not represent dead state or its edges

The automaton represents possible sets of slices through syntax trees. The state records, for each non-terminal the way in which it's last ocurrence was unfolded in the tree.

Consider a parse tree for

ID +   ID + ID
^     

when parser has read only token ID and +, we have the following cut through the tree, and the last expansions of E and T describe the state.