Here are the answers for the practice problems. I strongly suggest that you DO the problems before you look at the solutions. If you just follow along the solutions without actually doing the problem yourself, you will have problems with the midterm. Chau-Wen P.S. If you notice any mistakes in the solutions, please send me email. ------------------------------------------------------------------------- 1 a) strings beginning and ending in 0 b) all strings c) strings with 0 as third digit from right ------------------------------------------------------------------------- 2 a) 1*(0|01)* // after any 0, only allow zero or one 1 b) 1*0*(1|epsilon)0* // only a single 1 after first 0 ------------------------------------------------------------------------- 3 a) NFA for RE = (a|b)* e ------------------------------ / \ / e a \ | -----> S2 -----> S4 --- | S V / \ e | ---- t e / ---> e //--\\ a ---> S1 --> S6 ----->||S7|| r \ / e \\--// t | \ e b / ---- | -----> S3 -----> S5 --- ^ \ | \ e / ------------------------------------------- (State 7 is the final state) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ NFA for RE = (a*|b*)* e ------ | | e V a | e ---> S2 ---> S4 ---> S6 ---> S8 - / | ^ \ e/ | e | \ / ----------------------- | S / V ----- t e e e //---\\ a --->S1 <----------------------------------- S10 ----->||S11|| r | \ \\---// t | \ e ^ ----- | e\ ------ | ^ | \ | | / | | \ e V b | e / | \ --> S3 ---> S5 ---> S7 ---> S9 - / \ | ^ / \ | e | / \ ---------------------- / \ / \ e / --------------------------------------- (State 11 is the final state) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ b) Sequence of moves for DFA for (a|b)* in parsing "ababbab" 1,2,4,6, // a 1,3,5,6, // b 1,2,4,6, // a 1,3,5,6, // b 1,3,5,6, // b 1,2,4,6, // a 1,3,5,6,7 // b Sequence of moves for DFA for (a*|b*)* in parsing "ababbab" 1,2,4,6,8,10, // a 1,3,5,7,9,10, // b 1,2,4,6,8,10, // a 1,3,5,7,9,10, // b 1,3,5,7,9,10, // b 1,2,4,6,8,10, // a 1,3,5,7,9,10,11 // b Another sequence of moves for DFA for (a*|b*)* in parsing "ababbab" 1,2,4,6,8,10, // a 1,3,5,7,9,10, // b 1,2,4,6,8,10, // a 1,3,5,7,5,79,10, // b,b 1,2,4,6,8,10, // a 1,3,5,7,9,10,11 // b ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ c) DFA for (a|b)* from NFA using subset construction State 0 = { 1,2,3,7 } State 1 = Move(State 0, a) = { 1,2,3,4,6,7 } State 2 = Move(State 0, b) = { 1,2,3,5,6,7 } Move(State 1, a) = State 1 Move(State 1, b) = State 2 Move(State 2, a) = State 1 Move(State 2, b) = State 2 S ---- ---- t //--\\ a //--\\ <---- a --->|| 0||------------->|| 1|| |a r \\--// \\--// ----- t ---- \ ---- \ | ^ \ b| |a \ V | \ ---- \ b //--\\ <---- ------->|| 2|| |b \\--// ----- ---- (State 0 is the start state) (States 0,1,2 are all final states, since they include State 7 of the DFA) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ DFA for (a*|b*)* from NFA using subset construction State 0 = { 1,2,3,4,5,8,9,10,11 } State 1 = Move(State 0, a) = { 1,2,3,4,5,6,8,9,10,11 } State 2 = Move(State 0, b) = { 1,2,3,4,5,7,8,9,10,11 } Move(State 1, a) = State 1 Move(State 1, b) = State 2 Move(State 2, a) = State 1 Move(State 2, b) = State 2 S ---- ---- t //--\\ a //--\\ <---- a --->|| 0||------------->|| 1|| |a r \\--// \\--// ----- t ---- \ ---- \ | ^ \ b| |a \ V | \ ---- \ b //--\\ <---- ------->|| 2|| |b \\--// ----- ---- (State 0 is the start state) (States 0,1,2 are all final states, since they include State 11 of the DFA) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ d) Minimization of DFA for (a|b)* and (a*|b*)* [same DFA] Initial Partition: Partition 0 = { All final states } = { State 0, 1, 2} Partition 1 = { All nonfinal states } = { } Refine Partition 0: a State 0 ---> State 1 (partition 1 to partition 1) b State 0 ---> State 2 (partition 1 to partition 1) a State 1 ---> State 1 (partition 1 to partition 1) b State 1 ---> State 2 (partition 1 to partition 1) a State 2 ---> State 1 (partition 1 to partition 1) b State 2 ---> State 2 (partition 1 to partition 1) ( Partition 0 can not be split further ) Resulting minimal DFA: -- a| | S | V t ---- a //--\\ <---- r --> || 0|| |b t \\--// ----- (State 0 is the start state. State 0 is also a final state) ------------------------------------------------------------------------- 4 a) Grammar for all strings of 0's and 1's with same number of 0's and 1's S -> A S B | B S A | epsilon // strings with same # of 0's & 1's A -> 1 S | S 1 // strings with one more 1 than 0 B -> 0 S | 0 S // strings with one more 0 than 1 or S -> 1 S 0 S | 0 S 1 S | epsilon // strings with same # of 0's & 1's Both grammars are ambiguous b) Grammar for all strings of 0's and 1's with more 0's than 1's S' -> B S' | B // strings with more 0's than 1's S -> A S B | B S A | epsilon // strings with same # of 0's & 1's A -> 1 S | S 1 // strings with one more 1 than 0 B -> 0 S | 0 S // strings with one more 0 than 1 Grammar is ambiguous c) Grammar for all strings with balanced parentheses S -> S S | ( S ) | epsilon ------------------------------------------------------------------------- 5 a) Parse trees for (a,a) (a,(a,a)) S S /|\ /|\ ( L ) ( L ) /|\ /|\ L , S L , S | | | /|\ S a S ( L ) | | /|\ a a L , S | | S a | a b) S -> (L) -> (L,S) -> (S,S) -> (a,S) -> (a,a) S -> (L) -> (L,S) -> (S,S) -> (a,S) -> (a,(L)) -> (a,(L,S)) -> (a,(S,S)) -> (a,(a,S)) -> (a,(a,a)) c) S -> (L) -> (L,S) -> (L,a) -> (S,a) -> (a,a) S -> (L) -> (L,S) -> (L,(L)) -> (L,(L,S)) -> (L,(L,a)) -> (L,(S,a)) -> (L,(a,a)) -> (S,(a,a)) -> (a,(a,a)) ------------------------------------------------------------------------- 6 a) S -> aSbS -> abS -> abaSbS -> ababS -> abab S -> aSbS -> abSaSbS -> abaSbS -> ababS -> abab b) S -> aSbS -> aSbaSbS -> aSbaSb -> aSbab -> abab S -> aSbS -> aSb -> abSaSb -> abSab -> abab Example parse trees S S / | \ / | \ / | | \ / | | \ a S b \ a S b S | S / | \ \ e / | \ / | | \ e / | | \ b S a S a S b S | | | e e e ------------------------------------------------------------------------- 7) a) S -> (L) | a with no left recursion S -> (L) | a L -> L,S | S -----------------------> L -> S L' (where e = empty str) L' -> , S L' | e ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ b) recursive descent parser (with lookahead) tok; // current token match(x) { // matches token if (tok != x) // if wrong token error(); // exit with error tok = getToken(); // get new token } parser() { tok = getToken(); // initialize S( ); // start symbol match("$"); // match EOF } S( ) { if (tok == "(" )) { // S -> ( L ) match("("); L(); match(")"); } else if (tok == "a")) // S -> a match("a"); else error(); } L ( ) { S(); L'(); // L -> S L' } L' ( ) { if (tok == ",") { // L' -> , S L' match(","); S(); L'(); } else // L' -> e ; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ c) example parse of (a,a) Sequence of function calls in parse. Indentation indicates nesting level // Remaining input S(); // (a,a) match("("); // a,a) L(); // a,a) S(); // a,a) match("a"); // ,a) L'(); // ,a) match(","); // a) S(); // a) match("a"); // ) L'(); // ) match(")"); // ------------------------------------------------------------------------- 8 a) Construct the canonical set of LR(0) items Grammar Augmented Grammar S -> AS | b S'-> S First(S') = a b Follow(S') = $ A -> SA | a S -> AS | b First(S) = b a Follow(S) = $ a b A -> SA | a First(A) = a b Follow(A) = a b Note that whenever S or A is about to be parsed, we have the nonkernel LR(0) items: [ S -> *AS ] NK = [ S -> *b ] [ A -> *SA ] [ A -> *a ] State 0: [ S'-> *S ] NK items State 1 = Goto(State 0, a) = [ A -> a* ] State 2 = Goto(State 0, b) = [ S -> b* ] State 3 = Goto(State 0, S) = [ S' -> S* ] [ A -> S*A ] [ S -> *AS ] \ [ S -> *b ] \ NK [ A -> *SA ] / Items [ A -> *a ] / State 4 = Goto(State 3, A) = [ A -> SA* ] [ S -> A*S ] [ S -> *AS ] \ [ S -> *b ] \ NK [ A -> *SA ] / Items [ A -> *a ] / State 5 = Goto(State 4, A) = [ S -> A*S ] [ S -> *AS ] \ [ S -> *b ] \ NK [ A -> *SA ] / Items [ A -> *a ] / State 6 = Goto(State 5, S) = [ S -> AS* ] [ A -> S*A ] [ S -> *AS ] \ [ S -> *b ] \ NK [ A -> *SA ] / Items [ A -> *a ] / State 7 = Goto(State 6, S) = [ A -> S*A ] [ S -> *AS ] \ [ S -> *b ] \ NK [ A -> *SA ] / Items [ A -> *a ] / [All state transitions] Goto(State 0, a) = State 1 Goto(State 3, a) = State 1 Goto(State 4, a) = State 1 Goto(State 5, a) = State 1 Goto(State 6, a) = State 1 Goto(State 7, a) = State 1 Goto(State 0, b) = State 2 Goto(State 3, b) = State 2 Goto(State 4, b) = State 2 Goto(State 5, b) = State 2 Goto(State 6, b) = State 2 Goto(State 7, b) = State 2 Goto(State 0, S) = State 3 Goto(State 3, A) = State 4 Goto(State 6, A) = State 4 Goto(State 7, A) = State 4 Goto(State 0, A) = State 5 Goto(State 4, A) = State 5 Goto(State 5, A) = State 5 Goto(State 4, S) = State 6 Goto(State 5, S) = State 6 Goto(State 3, S) = State 7 Goto(State 6, S) = State 7 Goto(State 7, S) = State 7 (State 3 is a final state because it contains [ S'-> S* ]) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ b) Construct the canonical set of LR(1) items S -> AS | b S'-> S First(S') = a b A -> SA | a S -> AS | b First(S) = a b A -> SA | a First(A) = a b Note that whenever S or A is about to be parsed with lookahead {a,b}, we have the nonkernel LR(1) items: [ S -> *AS , {a,b} ] NK = [ S -> *b , {a,b} ] [ A -> *SA , {a,b} ] [ A -> *a , {a,b} ] State 0 = start state = [ S'-> *S , $ ] [ S -> *AS , $ ] [ S -> *b , $ ] [ S -> *AS , {a,b} ] \ [ S -> *b , {a,b} ] \ NK [ A -> *SA , {a,b} ] / Items [ A -> *a , {a,b} ] / State 1 = Goto(State 0, a) = [ A -> a* , {a,b} ] State 2 = Goto(State 0, b) = [ S -> b* , {a,b,$} ] State 3 = Goto(State 0, S) = [ S'-> S* , $ ] [ A -> S*A , {a,b} ] [ S -> *AS , {a,b} ] \ [ S -> *b , {a,b} ] \ NK [ A -> *SA , {a,b} ] / Items [ A -> *a , {a,b} ] / State 4 = Goto(State 3, A) = [ A -> SA* , {a,b} ] [ S -> A*S , {a,b} ] [ S -> *AS , {a,b} ] \ [ S -> *b , {a,b} ] \ NK [ A -> *SA , {a,b} ] / Items [ A -> *a , {a,b} ] / State 5 = Goto(State 4, A) = [ S -> A*S , {a,b} ] [ S -> *AS , {a,b} ] \ [ S -> *b , {a,b} ] \ NK [ A -> *SA , {a,b} ] / Items [ A -> *a , {a,b} ] / State 6 = Goto(State 4, S) = [ S -> AS* , {a,b} ] [ A -> S*A , {a,b} ] [ S -> *AS , {a,b} ] \ [ S -> *b , {a,b} ] \ NK [ A -> *SA , {a,b} ] / Items [ A -> *a , {a,b} ] / State 7 = Goto(State 6, S) = [ A -> S*A , {a,b} ] [ S -> *AS , {a,b} ] \ [ S -> *b , {a,b} ] \ NK [ A -> *SA , {a,b} ] / Items [ A -> *a , {a,b} ] / State 8 = Goto(State 3, b) = [ S -> b* , {a,b} ] State 9 = Goto(State 0, A) = [ S -> A*S , {a,b,$} ] [ S -> *AS , $ ] [ S -> *b , $ ] [ S -> *AS , {a,b} ] \ [ S -> *b , {a,b} ] \ NK [ A -> *SA , {a,b} ] / Items [ A -> *a , {a,b} ] / State 10= Goto(State 9, S) = [ S -> AS* , {a,b,$} ] [ A -> S*A , {a,b} ] [ S -> *AS , {a,b} ] \ [ S -> *b , {a,b} ] \ NK [ A -> *SA , {a,b} ] / Items [ A -> *a , {a,b} ] / [All state transitions] Goto(State 0, a) = State 1 Goto(State 3, a) = State 1 Goto(State 4, a) = State 1 Goto(State 5, a) = State 1 Goto(State 6, a) = State 1 Goto(State 7, a) = State 1 Goto(State 9, a) = State 1 Goto(State 10,a) = State 1 Goto(State 0, b) = State 2 Goto(State 9, b) = State 2 Goto(State 0, S) = State 3 Goto(State 3, A) = State 4 Goto(State 6, A) = State 4 Goto(State 7, A) = State 4 Goto(State 10,A) = State 4 Goto(State 4, A) = State 5 Goto(State 5, A) = State 5 Goto(State 4, S) = State 6 Goto(State 5, S) = State 6 Goto(State 3, S) = State 7 Goto(State 6, S) = State 7 Goto(State 7, S) = State 7 Goto(State 10,S) = State 7 Goto(State 3, b) = State 8 Goto(State 4, b) = State 8 Goto(State 5, b) = State 8 Goto(State 6, b) = State 8 Goto(State 7, b) = State 8 Goto(State 10,b) = State 8 Goto(State 0, A) = State 9 Goto(State 9, A) = State 9 Goto(State 9, S) = State 10 (State 3 is a final state because it contains [ S'-> S* , $ ]) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ c) Construct the LR(1) action/goto tables Num Production 1 S'-> S 2 S -> AS 3 S -> b 4 A -> SA 5 A -> a -------------------------------- | Action | Goto | | | | State | a | b | $ | S | A | ---------------------------------------- | 0 | s1 | s2 | | 3 | 9 | ---------------------------------------- | 1 | r5 | r5 | | | | ---------------------------------------- | 2 | r3 | r3 | r3 | | | ---------------------------------------- | 3 | s1 | s8 | acc | 7 | 4 | ---------------------------------------- | 4 | s1 | s8 | | 6 | 5 | | | r4 | r4 | | | | ---------------------------------------- | 5 | s1 | s8 | | 6 | 5 | ---------------------------------------- | 6 | s1 | s8 | | 7 | 4 | | | r2 | r2 | | | | ---------------------------------------- | 7 | s1 | s8 | | 7 | 4 | ---------------------------------------- | 8 | r3 | r3 | | | | ---------------------------------------- | 9 | s1 | s2 | | 10 | 9 | ---------------------------------------- | 10 | s1 | s8 | | 7 | 4 | | | r2 | r2 | r2 | | | ---------------------------------------- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ d) List and resolve conflicts From the ACTION/GOTO tables, it's clear to see that there are shift/reduce conflicts in states 4, 6, and 10 when the lookahead is either a or b. The actual LR(1) items in conflict in the states are: State [ A -> SA* , {a,b} ] reduce on a/b 4 [ S -> *b , {a,b} ] shift on b [ A -> *a , {a,b} ] shift on a States [ S -> AS* , {a,b} ] reduce on a/b 6 & 10 [ S -> *b , {a,b} ] shift on b [ A -> *a , {a,b} ] shift on a If we always shift for a shift/reduce conflict, we would never reduce A -> SA, and we would only reduce S -> AS with $ as lookahead. This is similar to removing the production A -> SA from the grammar. It would cause us to accept only the strings "ab", "aab", "aaab", etc. If we always reduce for a shift/reduce conflict, we never have more than two non-terminals on the stack, but should be able to generate all strings in the language. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ e) Example LR(1) parse for "abab" Some rightmost derivations S' -> S -> AS -> AAS -> AAb -> ASAb -> ASab -> Abab -> abab S' -> S -> AS -> Ab -> SAb -> Sab -> ASab -> Abab -> abab Stack Input Action/Goto ------------------------------------------------------------- $ 0 a b a b $ action(0,a) = s1 $ 0 a 1 b a b $ action(1,b) = r5 (A -> a) $ 0 A b a b $ goto(0,A) = 9 $ 0 A 9 b a b $ action(9,b) = s2 $ 0 A 9 b 2 a b $ action(2,a) = r3 (S -> b) $ 0 A 9 S a b $ goto(9,S) = 10 A: $ 0 A 9 S 10 a b $ action(10,a) = s1 or r2 (S -> AS) (assume we do s1 from A:) $ 0 A 9 S 10 a 1 b $ action(1,b) = r5 (A -> a) $ 0 A 9 S 10 A b $ goto(10,A) = 4 B: $ 0 A 9 S 10 A 4 b $ action(4,b) = s8 or r4 (A -> SA) (assume we do s8 from B:) $ 0 A 9 S 10 A 4 b 8 $ action(8,$) = error (assume we do r4 from B:) $ 0 A 9 A b $ goto(9,A) = 9 $ 0 A 9 A 9 b $ action(9,b) = s2 $ 0 A 9 A 9 b 2 $ action(2,$) = r3 (S -> b) $ 0 A 9 A 9 S $ goto(9,S) = 10 $ 0 A 9 A 9 S 10 $ action(10,$)= r2 (S -> AS) $ 0 A 9 S $ goto(9,S) = 10 $ 0 A 9 S 10 $ action(10,$)= r2 (S -> AS) $ 0 S $ goto(0,S) = 3 $ 0 S 3 $ action(3,$) = accept! (assume we do r2 from A:) $ 0 S a b $ goto(0,S) = 3 $ 0 S 3 a b $ action(3,a) = s1 $ 0 S 3 a 1 b $ action(1,b) = r5 (A -> a) $ 0 S 3 A b $ goto(3,A) = 4 C: $ 0 S 3 A 4 b $ action(4,b) = s8 or r4 (A -> SA) (assume we do s8 from C:) $ 0 S 3 A 4 b 8 $ action(8,$) = error (assume we do r4 from C:) $ 0 A b $ goto(0,A) = 9 $ 0 A 9 b $ action(9,b) = s2 $ 0 A 9 b 2 $ action(2,$) = r3 (S -> b) $ 0 A 9 S $ goto(9,S) = 10 $ 0 A 9 S 10 $ action(10,$)= r2 (S -> AS) $ 0 S $ goto(0,S) = 3 $ 0 S 3 $ action(3,$) = accept! ------------------------------------------------------------------------- 9) S -> Aa | bAc | Bc | bBa A -> d B -> d ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ LR(1) Parser State 0: [ S -> *Aa , $ ] [ S -> *bAc , $ ] [ S -> *Bc , $ ] [ S -> *bBa , $ ] [ A -> *d , a ] [ B -> *d , c ] State 1: Goto(State 0, d) = [ A -> d* , a] [ B -> d* , c] State 2: Goto(State 0, b) = [ S -> b*Ac , $ ] [ S -> b*Ba , $ ] [ A -> *d , c ] [ B -> *d , a ] State 3: Goto(State 2, d) = [ A -> d* , c ] [ B -> d* , a ] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ LALR(1) Parser Merge lookaheads for State 1 and State 3 in LR(1) parser to create new state: State = Merge (State 1, State 3) = [ A -> d* , a ] [ A -> d* , c ] [ B -> d* , c ] [ B -> d* , a ] Reduce/reduce conflict for lookahead "a" and "c" (i.e., can't decide whether to reduce "d" to A or B) ------------------------------------------------------------------------- 10 a) Grammar Attribute Rules E -> E' + T if ((E'.type == integer) && (T.type == integer)) E.type = integer; else E.type = float; E -> T E.type = T.type T -> num.num T.type = float T -> num T.type = integer b) "5 + 4 + 3.2 + 1" Parse Tree -> T -> num -> 1 -> num -> 2 / / E --> + --------------------> T --> . \ / \ -> E --> + -> T -> num -> 4 -> num -> 3 \ / -> E --> + \ -> E --> T -> num -> 5 Annotated Parse Tree T.type = int -> num.type = int -> 5 -> num.type = int -> 2 / T.type = float --> . \ -> num.type = int -> 3 ------------------------------------------------------------------------- 11 a) foo => array(0..99, record((a X integer) X (b X integer))) b) bar => (integer X record((a X integer) X (b X integer))) -> pointer(record((a X integer) X (b X integer))) ------------------------------------------------------------------------- 12 a) E1 := E2 + E3 { E1.type.lo = E2.type.lo + E3.type.lo; E1.type.hi = E2.type.hi + E3.type.hi; } | E2 * E3 { /* assuming that all numbers are positive */ E1.type.lo = E2.type.lo * E3.type.lo; E1.type.hi = E2.type.hi * E3.type.hi; } ------------------------------------------------------------------------- 13 Given int a, b; int foo() { int a, c; } int bar() { int a, d; /* HERE */ } a) Show how symbol tables can represent the state of each scope at the point marked HERE foo main -------- ----------| |int a | |int a |<---------|int c | |int b | |------| |int foo()| |int bar()| bar ----------|<---------|------| |int a | |int d | |------| b) what symbols are visible / not visible: visible: a (from bar), d, foo, bar, b not visible: a (from foo), c, a (from main) ------------------------------------------------------------------------- 14 Give the intermediate representations for a * - (b + c) a) syntax tree * / \ a uminus | + / \ b c b) postfix notation a b c + uminus * OR b c + uminus a * c) 3-address code // abstract // load-store RTL t1 = b + c load R1 b t2 = - t1 load R2 c t3 = a * t2 add R3 R1 R2 neg R4 R1 load R5 a mult R6 R5 R4 d) stack code // abstract // Java push a iload index(a) push b iload index(b) push c iload index(c) add iadd neg ineg mult imul