Due Monday October 8th by 10:15am sharp (right before the class).
Find a regular expression that generates all alternating sequences of 0 and 1 with arbitrary length (including lengths zero, one, two, …). For example, the alternating sequences of length one are 0 and 1, length two are 01 and 10, length three are 010 and 101. Note that no two adjacent character can be the same in an alternating sequence.
Construct a DFA for the language of well-nested parenthesis with a maximal nesting depth of 3. For example, ε, ()(), (()(())) and (()())()(), but not (((()))) nor (()(()(()))), nor ())).
By well-nested parentheses we mean those that are correctly nested, and are thus given by the grammar such as the following:
An alternative way to describe such strings is to say that:
- the total number of open and closed parentheses in is the same;
- in each prefix of (i.e. in such that ) the number of closed parantheses in is less than or equal the number of open parantheses in .
Recall that the general algorithm for minimizing finite automata works in reverse. First, find all pairs of inequivalent states. States X, Y are inequivalent if X is final and Y is not, or (by iteration) if and and X' and Y' are inequivalent. After this iteration ceases to find new pairs of inequivalent states, then X, Y are equivalent, if they are not inequivalent.
Let be any deterministic finite automaton. Assume that contains exactly states. Show that if it accepts at least one string of length or greater then the accepted language is infinite.
Let rtail be a function that returns all the symbols of a string except the last one. For example, rtail(Lexer) = Lexe. rtail is undefined for an empty string. If is a regular expression, then applies the function to all the elements. For example, is the language that can be described by a regular expression .
Prove that is regular if is not nullable.
Bonus part: Let denote the length of the string . Let be a regular language. Is the language always regular?
Consider a language with the following tokens and token classes:
ident ::= letter (letter|digit)* LT ::= "<" GT ::= ">" shiftL ::= "<<" shiftR ::= ">>" dot ::= "." LP ::= "(" RP ::= ")"
Give a sequence of tokens for the following character sequence, applying the longest match rule:
Note that the input sequence contains no space character. You can separate the tokens in the output sequence by a comma for the purpose of presentation, like so:
LP, ident, ...