CMSC 714 – High Performance Computing

Fall 2011 - OpenMP Programming Assignment

Due Wednesday, October 12, 2011 @ 6:00 PM

The purpose of this programming assignment is to gain experience in writing OpenMP programs.  You will start with a working serial program (quake.c) that models an earthquake and add OpenMP directives to create a parallel program.

HINTS

The goal is to be systematic in figuring out how to parallelize this program.  You should start by using the gprof command to figure out which parts of the program take the most time (to use gprof, you will need to compile your program with the -pg switch).  From there you should examine the loops in the most important subroutines and figure out how to add OpenMP directives.
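As a rough illustration of the kind of directive involved, here is a minimal sketch of a loop whose iterations are independent apart from a shared sum. It is not taken from quake.c; the function name dot and its arguments are hypothetical, chosen only for the example.

```c
/* Hypothetical example, not from quake.c: dot product of two vectors.
 * Each iteration is independent except for the shared sum, so the
 * reduction clause gives every thread a private partial sum and
 * combines the partial sums when the loop ends.
 * If the file is compiled without -fopenmp, the pragma is ignored
 * and the loop simply runs serially. */
double dot(const double *x, const double *y, int n)
{
    double sum = 0.0;
    int i;

    #pragma omp parallel for reduction(+:sum)
    for (i = 0; i < n; i++) {
        sum += x[i] * y[i];
    }
    return sum;
}
```

Loops in which one iteration reads a value written by another iteration cannot be handled this simply, so check for such dependences before adding a directive.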

The programs will be run on an Intel Xeon 24-core shared memory machine (called buzz.cs.umd.edu).  Your account names (and initial passwords) have already been handed out.

WHAT TO TURN IN

You should submit your program and the times to run it on the input file quake.in (for 1, 2, 4, 8 and 16 threads).  Because quake runs for a while on this input with small numbers of threads, a second input file that runs for much less time, quake.in.short, is provided for your testing.  So that you don't have to make copies of the somewhat large input data files, they are available on buzz in ~als/public/714/OpenMP/data .  A copy of the serial quake.c is also available in ~als/public/714/OpenMP/src .
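One possible way to collect the run times is to read a wall-clock timer at the start and end of the program. The sketch below is only a suggestion under that assumption; the helper name wall_seconds is hypothetical and does not appear in quake.c. It uses omp_get_wtime, which is part of the OpenMP runtime library, and falls back to the much coarser time() when the file is compiled without -fopenmp.

```c
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: current wall-clock time in seconds.
 * Wall-clock time is the quantity to compare across thread counts;
 * CPU time (as reported by clock()) adds up the time of all threads
 * and therefore does not shrink as threads are added. */
double wall_seconds(void)
{
#ifdef _OPENMP
    return omp_get_wtime();        /* high-resolution OpenMP timer */
#else
    return (double)time(NULL);     /* fallback, 1-second resolution */
#endif
}
```

Calling wall_seconds() once before and once after the work and subtracting the two values gives the elapsed time for that run.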

You also must submit a short report about the results (1-2 pages) that explains:

USING OPENMP

To compile OpenMP programs we will be using gcc version 4.4, which nicely has OpenMP support built in. In general, you can compile this assignment with:

$ gcc44 -fopenmp -pg -o quake quake.c -lm

Please note the "44" at the end of gcc; we have to add this because the default paths on buzz point to an older version of gcc that does not support OpenMP well. The -fopenmp switch tells the compiler to, you guessed it, recognize OpenMP directives. -lm is required because our program uses the math library. -pg is needed to collect profiling data when the program is run; you can remove this option before you do final performance testing.

The environment variable OMP_NUM_THREADS sets the number of threads (and presumably processors) that will run the program.  Set the value of this environment variable in the shell window from which you are about to run the program.  If it is not set, the program defaults to using all available processors, which on buzz means 24.
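To confirm that the thread count you intended is actually in effect, the program can ask the OpenMP runtime. The following is a minimal sketch; the function name thread_budget is hypothetical, while omp_get_max_threads is a standard OpenMP runtime routine.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: number of threads a parallel region would use
 * by default. With OpenMP enabled this reflects OMP_NUM_THREADS when
 * that variable is set; without -fopenmp the program is serial, so
 * the answer is 1. */
int thread_budget(void)
{
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}
```

If you print this value while testing, writing it to standard error keeps it separate from the output quake writes to standard output.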

RUNNING THE PROGRAM

Quake reads its input file from standard input and produces its output on standard output.  Quake generates an output message periodically (every 30 of its simulation time steps), so you should be able to tell if it is making progress.

GRADING

The project will be graded as follows:

Item                              Pct
Correctly runs with 1 thread      10%
Correctly runs with 16 threads    40%
Performance with 1 thread         10%
Speedup of parallel version       20%
Writeup                           20%