LARA

Automata for LR Parsing without Lookahead

These automata are used to construct LR(0) and SLR parsers

We describe their states and transitions

LR(0) item is a dotted production $X::=p.q$

  • $X::=pq$ is grammar rule
  • $p,q$ some strings of terminals and non-terminals

Note: this is similar to Earley Parser item

  • but the initial position is thrown away

We use letters I,J to denote sets of LR(0) items

  • approximates a set of paths in a possible syntax tree
  • left side of tree corresponds to input seen so far

Again, if $D$ was original start symbol, we

  • add (D'::=D EOF) to grammar
  • make $D'$ new start symbol

Closure and Goto Functions

These functions map sets of items to sets of items

Closure is similar to applying prediction in Earley Parser.

closure(I) is least set such that

  • I $\subseteq$ closure(I)
  • if (Y::=r) is a rule and $(X::=p.Yq)\in I$, then (Y::=.r) $\in$ closure(I)

Goto corresponds to

\begin{equation*}
   goto(I,X) = closure(\{ Y::=qX.r \mid (Y::=q.Xr)\in I \})
\end{equation*}

Constructing Automaton

Initial state $I_0 = closure(\{D'::=.D \verb! EOF!\})$

Edges are labeled by grammar terminals and non-terminals

States and edges are the least sets (T,E) such that

  • $I_0 \in T$
  • for each terminal and non-terminal $X$ and each $I \in T$, if $J=goto(I,X)$ then
    • $J \in T$
    • $(I,X,J) \in E$

This automaton is deterministic

  • empty set is dead state
  • we do not represent dead state or its edges

Running automaton from initial state on stack results in state K

  • the state K determines the action of LR parser
  • if dead state, we have parse error