CMSC 714 Midterm (Fall 2010)


(1) This exam is closed book, closed notes, and closed neighbor.

(2) You have 70 minutes to complete this exam. If you finish early, you may turn in your exam at the front of the room and leave.

(3) Write all answers in the supplied exam booklet. Start each new problem (but not sub-problem) on a new page.

(4) Partial credit will be given for most questions, assuming I can figure out what you were doing.

(5) Please write neatly. Print your answers if your handwriting is hard to read. If you write something and wish to cross it out, simply put an X through it.

1. (25 points) Define and explain the following terms:

A. happens before

B. reduction operation

C. fat tree

D. data parallelism

E. multi-grid

2. (25 points) You are running a computer center with the following types of users. Consider the architectures we have studied this term and recommend computers for each group. Be sure to consider processor design, interconnection network, and programming models.

A. Scientists with iterative solvers using principally dense matrix arithmetic who need only a couple of runs per year, but each run takes hundreds of thousands of hours.

B. Computer scientists with programs full of pointer-based data structures. There are lots of jobs, and each uses a large amount of memory (tens of GB per run).

C. Engineers running large parameter studies. They have a sequential workload, but need to run thousands of different parameter combinations to reach a solution.

3. (15 points) Memory Systems

A. When tuning an application for a memory system, give two reasons that the number of memory stall cycles might be different for two different cache misses.

B. On a GPU, cross-lane operations can be expensive. Explain what they are and why they are more expensive on GPUs than on CPUs.

4. (20 points) Runtime adaptation of computation is often useful to improve program performance.

A. Give two examples of things that can be tuned at runtime, and state what is required from the application to allow these changes to be made.

B. Explain the tradeoffs between having humans in the runtime tuning loop and having this process fully automated.

5. (15 points) Scheduling

A. Why is space sharing, and not time sharing, the dominant mode of scheduling HPC systems?

B. How do backfilling and poor user estimates of job running time result in improved overall waiting time in a scheduling system?