Practice as many as possible. See previous course pages for sample problems.
*******************************************************************************
(1) Consider the following CFG
S > T  V
T > UU
U > aUb  ab
V > aVb  aWb
W > bWa  ba
(a) What is the language that this CFG produces?
Since we see a union in the first production, we can express the language as
a union of two languages.
U produces strings in which there are equal number of 'a' and 'b'.
[[U]] = {a^n b^n n >=1 }
It is important to write the condition on 'n' as we make it clear that there
are no empty strings in the language.
For T, we have a concatenation of the above language with itself, except that
the exponent might be different.
[[T]] = {a^n b^n a^m b^m  n,m >= 1 }
If we write some sample strings for the V production, we can see that the
pattern looks like,
[[V]] = {a^n b^m a^m b^n  n,m >= 1}
Therefore
[[S]] = {a^n b^n a^m b^m  n,m >= 1 } U {a^n b^m a^m b^n  n,m >= 1}
IMP: Write the grammar given the language.
(b) Is the grammar ambiguous? If yes, give an example string and write the
derivations to prove ambiguity.
The grammar is ambiguous for all strings where n=m. Ex: abab
S=>T=>UU=>abU=>abab
S=>V=>aWb=>abab
*******************************************************************************
(2) Consider the following CFG.
E > T+E  TE  T*E  T
T > a  b  c  (E)
(a) Construct the parse tree for c*ab
E
/  \
T * E
 /  \
c T  E
 
a T

b
Are there any more?
(b) Convert the above CFG to enforce the following precedence:
Grouping by parenthesis (highest precedence)
Multiplication
Addition or Subtraction (lowest precedence)
E > T+E  TE  T
T > U*T  U
U > a  b  c  (E)
(c) Now write the parse tree for c*ab
E
/  \
T  E
/  \ 
U * T T
  
c U U
 
a b
*******************************************************************************
(3) Write CFGs for the following languages.
(a) Palindromes over the alphabet {a,b,c}
S > aSa  bSb  cSc  a  b  c  eps
(b) {a^m b^n c^p d^q  m + n = p + q ; m,n,p,q >=0}
The sums can be equal if for every 'a' we have either 'c' or 'd'.
Similarly for every 'b' we need another 'c' or 'd'. So we start off with
S > aSc  aSd  bSc  bSd  eps
But the above CFG doesn't maintain the order of characters i.e. it can
generate invalid strings like "abadcd".
So we will use different nonterminals for each case in the production.
1. S > aTc  aUd  bVc  bWd  eps
Between 'a' and 'c', we can have 'b' or more of 'a' and 'c'.
Hence we have the following production which still maintains the sums.
2. T > aTc  bVc  eps
Between 'a' and 'd', we can insert any of the characters while maintaining
the sums. This turns out to be the same production as S i.e. we can replace
U with S in the production (1).
Between 'b' and 'c', we can only have more of 'b' and 'c'. Hence
3. V > bVc  eps
And the remaining production is
4. W > bWd  bVc  eps
The final CFG is
1. S > aTc  aSd  bVc  bWd  eps
2. T > aTc  bVc  eps
3. V > bVc  eps
4. W > bWd  bVc  eps