LARA

Homework 02

Due Monday, 17th October, 10:15am. Please hand in before the lecture.

For this homework, recall that a parse tree is a tree that represents the structure of the string to be parsed according to some grammar.
It can be defined recursively as follows:
- If t is a terminal, then t() is a parse tree.
- Given the parse trees t1(…), t2(..), …, t3(…) and a production rule Y → t1 t2 … t3, then Y(t1(…), t2(…), …, t3(…)) is also a parse tree.

Problem 1

Show that the regular languages can be recognized with LL(1) parsers. Describe a process that, given a regular expression, constructs an LL(1) parser for it.

Problem 2

A grammar is right regular if and only if the production rules are of the following form:

X -> a
X -> aY
X -> ε

where a is a terminal, X, Y are non-terminals. A left-regular grammar is defined analogously, with the second rule being X → Za.

Consider the following right regular grammar over $\Sigma = \{\#,!\}$:

S -> #A | !# | !
A -> !  | #B | ε
B -> #B | !

a) Determine how the input string '###!' is derived.
b) Find a regular expression that accepts the same language as this grammar.
c) Show that a language is regular if and only if there exists a regular grammar that generates it. I.e. show that for each regular expression there is a regular grammar that defines the same language, and vice versa. (A regular grammar is a grammar that is either right or left regular.)

Problem 3

The following grammar is motivated by type definitions in Scala.

TypeDefinition  := "type" Identifier "=" Type
Type            := Type ","  Type
Type            := Type "=>" Type
Type            := "Int"  |  "Boolean"

a) By giving at least one input which has two different parse trees prove that the grammar is ambiguous.
b) Given the fact that “=>” has a lower precedence than “,” resolve the ambiguity of the grammar. That is, write a new grammar that determines the same language but is not ambiguous, such that the Type “Int ⇒ Int, Int” is parsed as function from Int to Int,Int.

Problem 4

The following grammar generates all the regular expressions over $\Sigma = \{a,b,(,),+,*\}$.

S -> S S
S -> S '+' S
S -> S '*'
S -> '(' S ')'
S -> 'a' | 'b'

a) Compute the First and Follow sets for S.
b) By giving two different parse trees for an input, show that the grammar is ambiguous.
c) Assuming that the '*' has the highest and '+' has the lowest precedence, find a grammar that accepts the same language but is unambiguous.
d) Compute the First and Follow sets for the non-terminals of the new grammar.
e) Is the new grammar LL(1)?

Problem 5

A grammar is called LL(k) if it needs k tokens of lookahead when parsing a sentence. Does the following grammar belong to LL(k)? Justify your answer and find the smallest k you can, if applicable.

S -> AB
S -> BC
A -> a
A -> Ba
B -> Cc
C -> c
B -> a