Name:

## Midterm 2

CMSC 430 Introduction to Compilers Spring 2012

April 18, 2012

## Instructions

This exam contains 10 pages, including this one. Make sure you have all the pages. Write your name on the top of this page before starting the exam.

Write your answers on the exam sheets. If you finish at least 15 minutes early, bring your exam to the front when you are finished; otherwise, wait until the end of the exam to turn it in. Please be as quiet as possible.

If you have a question, raise your hand. If you feel an exam question assumes something that is not written, write it down on your exam sheet. Barring some unforeseen error on the exam, however, you shouldn't need to do this at all, so be careful when making assumptions.

| Question | Score | Max |
|----------|-------|-----|
| 1        |       | 28  |
| 2        |       | 15  |
| 3        |       | 22  |
| 4        |       | 35  |
| Total    |       | 100 |

## Question 1. Short Answer (28 points).

**a.** (7 **points**) Explain briefly what *bitvectors* are used for in data flow analysis, and why they may be worth using.

**b.** (7 points) Explain briefly what an *activation record* is, and list four items an activation record is likely to contain.

c. (7 points) In class, we discussed some recent tools that use dataflow analysis to find bugs in programs (rather than for optimization). Explain what *false positives* and *false negatives* are in this context. (You will not lose any points if you swap the definitions of these terms.)

d. (7 points) Explain in at most 3 sentences what dynamic software updating is.

Question 2. Definite assignment analysis (15 points). A variable is *definitely assigned to* at a program point if it is guaranteed to have been be written to (possibly more than once) on all paths from the start of the program to that point.

a. (4 points) What C programming mistake could we use definite assignment analysis to detect?

a. (8 points) Suppose we want to implement definite assignment analysis as a data flow analysis.

i. What is the *direction* of this analysis?

ii. Is this a *may* or a *must* analysis?

- iii. What are entry\_or\_exit\_facts for this analysis, in general terms?
- iv. What are the initial\_facts for this analysis, in general terms?

| b. (3 points) Write down Gen and Kill sets for the | e following statements for definite assignment analysis. |
|----------------------------------------------------|----------------------------------------------------------|
|----------------------------------------------------|----------------------------------------------------------|

| Statement | Gen | Kill |
|-----------|-----|------|
| a = b     |     |      |
|           |     |      |
| a = b + c |     |      |
|           |     |      |
| a = a + 1 |     |      |
|           |     |      |

This page intentionally left blank.

Question 3. Data flow analysis (22 points). Consider the following control-flow graph.



a. (5 points) Write down the Gen/Kill sets for live variable analysis for the following statements in the control flow graph. Write  $\emptyset$  for the empty set.

| Statement | Gen | Kill |
|-----------|-----|------|
| 1         |     |      |
| 3         |     |      |
| 6         |     |      |
| 8         |     |      |
| 9         |     |      |

**b.** (12 points) For each statement in the control-flow graph, show the results of available expressions, i.e., show the set of expressions that are available at the *end* of the statement. Write  $\emptyset$  for the empty set.

| Statement | Expressions |
|-----------|-------------|
| 1         |             |
| 2         |             |
| 3         |             |
| 4         |             |
| 5         |             |
| 6         |             |
| 7         |             |
| 8         |             |
| 9         |             |
| 10        |             |
| 11        |             |
| 12        |             |

c. (5 points) Write down the set of statements s such that the **definition** at statement 4 may reach the end of statement s. (Don't forget to include statement 4 in your list.)

Question 4. Code generation, register allocation, and optimization (35 points). Below is a snippet of codegen-2.ml from class, showing the input expression language, the "bytecode" instruction language, and compilation. (Here we've renamed 'L\_Register to 'L\_Reg to save some writing.)

a. (10 points) Suppose we extended the source language with a *while loop*:

type expr = ... | EWhile of expr \* expr

Here, EWhile(e1,e2) has the usual semantics, where e1 is the loop guard—which is considered false if 0 and true otherwise—and e2 is the loop body. Loops should be evaluated for their side effects only, and no matter how many times (0 or more) the loop body is evaluated, all while loops should return the value 0.

Write a case of comp\_expr that handles while loops. (Hint: recall that the IJmp and IIfZero instructions add their offset to the program counter for the *following* instruction; offsets may be negative.)

| EWhile (e1, e2)  $\rightarrow$ 

b. (15 points) Next, consider the bytecode program in the left column below.

In the second column below, list the numbers of registers that are live *after* each instruction—we've given a couple of examples to get you started.

Now suppose that we have three phyiscal registers available, *ra*, *rb*, and *rc*. In the rightmost column group below, assign each virtual register to a physical register, and indicate what virtual registers you need to *spill after* executing an instruction or *load before* executing an instruction. For example, if you had an instruction IAdd('L\_Reg 0, 'L\_Reg 1, 'L\_Reg 2), you could list 0 in the spill column and 1,2 in the load column if you needed to spill and load all possible registers. (Don't forget to also assign any loaded virtual registers to physical registers.)

You may use any spill policy you choose, but you should try to maximize register usage—don't load and spill all registers with every instruction.

|                                     |                | Reg. Assignment |    |               |        |       |
|-------------------------------------|----------------|-----------------|----|---------------|--------|-------|
| Instruction                         | Live registers | ra              | rb | $\mathbf{rc}$ | spills | loads |
| ILoad ('L_Reg 0, 'L_Int 1)          |                |                 |    |               |        |       |
| ILoad ('L_Reg 1, 'L_Int 42)         |                |                 |    |               |        |       |
| IAdd ('L_Reg 2, 'L_Reg 0, 'L_Reg 1) |                |                 |    |               |        |       |
| ILoad ('L_Reg 3, 'L_Int 2)          |                |                 |    |               |        |       |
| ILoad ('L_Reg 4, 'L_Int 41)         |                |                 |    |               |        |       |
| IMul ('L_Reg 5, 'L_Reg 4, 'L_Reg 2) |                |                 |    |               |        |       |
| IAdd ('L_Reg 6, 'L_Reg 3, 'L_Reg 5) |                |                 |    |               |        |       |
| IAdd ('L_Reg 7, 'L_Reg 2, 'L_Reg 6) |                |                 |    |               |        |       |
| IAdd ('L_Reg 8, 'L_Reg 7, 'L_Reg 0) | 8              |                 |    |               |        |       |
| IStore ('L_Ptr 3, 'L_Reg 8)         | None           |                 |    |               |        |       |

c. (10 points) For each of the following optimizations, write down a short sequence of instr s as they might appear before and after the optimization. You may write 'L\_Reg n as rn and 'L\_Int n as n. For example, for dead code elimination, you might write

| Before         | After          |
|----------------|----------------|
| ILoad (r0, 42) | ILoad (r0, 13) |
| ILoad (r0, 13) |                |

i. Common subexpression elimination

ii. Constant folding

iii. Copy propagation