====== Automata for LR Parsing without Lookahead ======

These automata are used to construct LR(0) and SLR parsers
  * same automata, SLR just uses some extra information

We describe their states and transitions

If $D$ was original start symbol, we 
  * add (D'::=D eof) to grammar
  * make $D'$ new start symbol

===== States: Sets of Items =====

We will construct states as sets of **items**

**LR(0) item** is a dotted production $X::=p.q$
  * $X::=pq$ is grammar rule 
  * $p,q$ some strings of terminals and non-terminals

Note: this is similar to [[Earley Parser]] item
  * but there is no more initial position!

We use letters I,J to denote **sets** of LR(0) items
  * approximates a set of paths in a syntax tree
  * left side of tree corresponds to input seen so far


**Example:**

Rules in our simple example grammar:
  D' ::= E eof
  E ::= T
  E ::= E + T
  T ::= ID

Example state is $I_0$, equal to the following set of items:
  { D' ::= .E eof,
    E ::= .T,
    E ::= .E + T,
    T ::= .ID }

===== Transitions: Closure and Goto Functions =====

These functions map sets of items to sets of items

**Closure** is similar to applying prediction in [[Earley Parser]].

If $I$ is a set of states, then closure(I) is least set such that
  * I $\subseteq$ closure(I)
  * if $(X::=p.Yq)\in I$ and $(Y::=r)$ is a grammar rule, then $(Y::=.r)\ \in\ closure(I)$

**Example:** Take the singleton set
  { D' ::= .E eof }
Then closure of this set is $I_0$:
  { D' ::= .E eof,
    E ::= .T,
    E ::= .E + T,
    T ::= .ID }
where rules are added in this order using the definition.

**Goto** corresponds to 
  * scan in [[Earley Parser]] when X is a terminal
  * completion in [[Earley Parser]] when X is a non-terminal
\begin{equation*}
   goto(I,X) = closure(\{ Y::=qX.r \mid (Y::=q.Xr)\in I \})
\end{equation*}

**Example:** Let us compute goto($I_0$, ID). Because only one element contains .ID, we obtain
  { T ::= ID. }
and the closure remains the same because . does not appear before a non-terminal.

  I1 = goto(I0, E) = { E ::= E . + T }
  I2 = goto(I1, +) = { E ::= E + . T , T ::= . ID }
  I3 = goto(I2, T) = { E ::= E + T. }

===== Constructing Automaton =====

Initial state $I_0 = closure(\{D'::=.D \verb! EOF!\})$

**Example:** What we denoted $I_0$ was indeed a correct initial state.

Edges are labeled by grammar terminals and non-terminals
  * they are $\Delta$ function of the [[:Finite state machine]]

States and edges are the least sets (T,E) such that
  * $I_0 \in T$
  * for each terminal and non-terminal $X$ and each $I \in T$, if $J=goto(I,X)$ then
    * $J \in T$
    * $(I,X,J) \in E$

===== Examples =====

{{cc09:lr0.png?640|LR0 Automaton}}

The automaton in this example is deterministic
  * empty set is dead state
  * we do not represent dead state or its edges

The automaton represents possible sets of slices through syntax trees. The state records, for each non-terminal the way in which it's last ocurrence was unfolded in the tree.

Consider a parse tree for
  ID +   ID + ID
       ^     

when parser has read only token ID and +, we have the following cut through the tree, and the last expansions of E and T describe the state.

{{cc09:parse-tree.png|Parse Tree for Expressions}}