CMSC 714 Midterm (Fall 2010)
(1) This exam is closed book, closed
notes, and closed neighbor.
(2) You have 70 minutes to complete this
exam. If you finish early, you may turn in your exam at the front of the room.
(3) Write all answers in the supplied exam
booklet. Start each new problem (but not sub-problem) on a new page.
(4) Partial credit will be given for most
questions assuming I can figure out what you were doing.
(5) Please write neatly. Print your
answers if your handwriting is hard to read. If you write something, and wish
to cross it out, simply put an X through it.
1. (25 points) Define and explain
the following terms:
A. happens before
B. reduction operation
C. fat tree
D. data parallelism
2. (25 points) You are running a
computer center with the following types of users. Consider the architectures
we have studied this term and recommend computers for each group. Be sure to
consider processor design, interconnection network, and programming models.
A. Scientists with iterative
solvers using principally dense matrix arithmetic who need to complete only a
couple of runs per year, but each run takes hundreds of thousands of hours.
B. Computer Scientists with
programs full of pointer-based data structures. There are lots of jobs and each
uses a large amount of memory (tens of GB per run).
C. Engineers running large
parameter studies. They have a sequential workload, but need to run thousands
of different parameter combinations to reach a solution.
3. (15 points) Memory Systems
A. When tuning an application for
a memory system, give two reasons that the number of memory stall cycles might be
different for two different cache misses.
B. On a GPU, cross-lane operations
can be expensive. Explain what they are and why they are more expensive on GPUs.
4. (20 points) Runtime adaptation of
computation is often useful to improve program performance.
A. Give two examples of things that can
be tuned at runtime and what is required from the application to allow these
changes to be made.
B. Explain the tradeoffs between
having humans in the runtime tuning loop and having this process fully
5. (15 points) Scheduling
A. Why is space sharing, and not
time sharing, the dominant mode of scheduling on HPC systems?
B. How do backfilling and poor
user estimates of job running time result in improved overall waiting time in a