CYK Parsing Algorithm for General Context-Free Grammars
Assume the grammar is in Chomsky Normal Form.
Example Grammar
S ::= L R | S S | L X X ::= S R L ::= "{" R ::= "}"
Consider an input string
{ { } { } { } }
For each terminal t in input, for which non-terminal X is it the case that X ?
{ { } { } { } } length of substring L L R L R L R R 1 S S S 2 X 3 S S 4 X 5 S 6 X 7 S 8
For each string w of length 2 in input, for which non-terminal X is it the case that X ?
Dynamic programming algorithm: for each substring, determine which non-terminals can generate it.
Let be input word
Let d(i)(j) denote non-terminals deriving substring of from i to j.
CYK algorithm:
INPUT: word w, grammar G in Chomsky normal form OUTPUT: true iff (w in L(G)) N = |w| var d : Array[N][N] forall i in {1..N} { d(i)(i) = {X | G contains X->w(i)} for j in {i + 1 .. N} d(i)(j) = {} } for k = 2 to N // substring length for i = 0 to N-k // initial position for j = 1 to k-1 // length of first half for each (X::=Y Z) in G if Y in d(i)(i+j-1) and Z in d(i+j)(i+k-1) d(i)(j) = d(i)(j) union {X} return (S in d(0,N-1))
A Programming Language Example
Consider this fragment of a grammar of language with references and procedure calls
statement ::= assign | call assign ::= expr "=" expr call ::= expr "." ID "(" expr ")" expr ::= ID | expr "." expr
Is this grammar LL(1)?
What is its Chomsky Normal Form?
Additional information:
Number of Distinct Parse Trees
Let be the string
()()()
It has 2 parse trees
What is the number of parse trees for a string of the form where repeats times?
What is the running time of the CYK parser on this input?
How to generate one or all syntax trees?