This is an old revision of the document!
Context-Free Grammars
Purpose and Expressive Power
A context-free grammar is a way to specify languages.
Context-free grammars are more powerful than regular expressions: for each regular expression there is an equivalent context-free grammar, but the converse need not hold. Context-free grammars supposed not just iteration like regular expressions, but also more general forms of recursion.
ab Example
Strings of the form for can be described by this recursive grammar
S ::= "ab" | "a" S "b"
This language is not regular.
Definition
Definition: Context-free grammar consists of
- set of terminal symbols (letters of the alphabet)
- set of non-terminal symbols (used to recursively define sets of strings)
- one of the non-terminal symbols that is designed to be the starting non-terminal
- a finite set of productions, which are grammar rules of the form
N ::= s1 ... sn
where N is a non-terminal, and each s1,…,sn is either a terminal or a non-terminal. The sequence s1…sn can be empty.
A string of terminals is generated by a grammar iff we can obtain this string by transforming sequences of terminals and non-terminals as follows:
- the initial string contains only starting non-terminal
- at each step, we select a non-terminal N in our string and select a production with N as left-hand side, and replace non-terminal with the right-hand side, obtaining string (this process is denoted )
means is obtained from by one or more steps
means is obtained from by zero or more steps
\[
L(G) = \{ w \mid w \in \Sigma^*, \ S \Rightarrow^* w \}
\]
Shorthands for Writing Context-Free Grammars
Note: we group multiple productions with same left-hand side, writing
N -> p | q
instead of
N -> p N -> q
Also, we allow nesting, writing e.g.
N -> p (q | r) s
instead of
N -> p M s M -> q | r
Finally, we allow the use of star as in regular expressions, because
N -> p*
is the same as
N -> "" | p N
Example of generating string aaabbb:
S -> aSb -> aaSbb -> aaabbb
Example: Polynomials
Polynomials are expressions that use variables, constants, operators +, * and parentheses. We can describe strings representing polynomials as follows:
polynomial ::= term | term "+" polynomial term ::= factor | factor "*" term factor ::= constant | variable | "(" polynomial ")"
we can also describe variables and constants:
variable ::= letter | variable digitSequence constant ::= digitSequence | "-" digitSequence digitSequence ::= digit | digit digitSequence digit ::= "0" | "1" | "2" | ... | "9" letter ::= "a" | "b" | "c" | ... | "x" | "y" | "z"
(Each table entry represents one step in relation .)
Example: English Sentences
This generates certain simple English sentences.
Sentence = Simple | Belief Simple = Person liking Person liking = "likes" | "does" "not" "like" Person = "Barack" | "Joe" | "John" | "Sarah" | "Snoopy" Belief = Person believing "that" Sentence optBut believing = "believes" | "does" "not" "believe" optBut = "" | "," "but" Sentence
Example sentence generated by above grammar:
“John does not believe that Barack believes that Sarah likes Snoopy, but Snoopy believes that Barack likes Joe.”
- examples of using grammars for text generation