Introduction to Parallel Computing (CMSC416)

Assignment 6: MPI (extra credit)

Due: December 6, 2023 @ 11:59 PM Eastern Time

For this assignment, you will implement a 2D decomposition of Assignment 1 in MPI using non-blocking routines. You will add two arguments at the end of the command line that specify the X and Y dimensions of the MPI virtual process grid. For example, when running with 64 MPI processes, we should be able to use an 8x8, 16x4, or 4x16 virtual grid of processes. The modified command line will look like this:


          mpirun -np <# of processes> ./life <data-file-name> <# of generations> <X_limit> <Y_limit> <# of processes in X> <# of processes in Y>
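
To make the 2D decomposition concrete, here is a minimal sketch of how a rank might work out which part of the board it owns. The helper names (block_range, block_of, rank_to_coords), the row-major rank ordering, and the choice to give leftover cells to the lowest-numbered blocks are all assumptions made for illustration; the assignment does not prescribe any of them, and the sketch deliberately leaves out all MPI calls.

```c
#include <assert.h>

/* Hypothetical type: a contiguous range of cells along one axis. */
typedef struct {
    int start;  /* index of the first cell owned */
    int count;  /* number of cells owned */
} block_range;

/* Hypothetical helper: split n cells along one axis into `parts` blocks
 * and return the range owned by block `coord`.
 * Assumption: the first (n % parts) blocks each get one extra cell. */
static block_range block_of(int n, int parts, int coord)
{
    assert(parts > 0 && coord >= 0 && coord < parts);
    int base = n / parts;    /* cells every block receives */
    int extra = n % parts;   /* leftover cells to hand out */
    block_range r;
    r.start = coord * base + (coord < extra ? coord : extra);
    r.count = base + (coord < extra ? 1 : 0);
    return r;
}

/* Hypothetical helper: map a rank to (cx, cy) in a px-wide process grid.
 * Assumption: ranks are laid out row-major, so X varies fastest. */
static void rank_to_coords(int rank, int px, int *cx, int *cy)
{
    assert(px > 0 && rank >= 0);
    *cx = rank % px;
    *cy = rank / px;
}
```

Under these assumptions, on a 1024x1024 board with a 16x4 process grid, rank 37 sits at grid coordinates (5, 2). Calling block_of(1024, 16, 5) gives it indices 320 to 383 along X, and block_of(1024, 4, 2) gives it indices 512 to 767 along Y. Each process would then exchange the edges of its block with its neighbors in the grid using the non-blocking routines.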
          

What to Submit

You must submit the following files and no other files:

  • life-nonblocking.[c,C,f77,f90]: parallel version using non-blocking Isend/Irecv routines, where the file extension depends on the language used for the implementation
  • Makefile that will compile your code successfully on zaratan when using mpicc or mpicxx. Make sure that the executable name is life-nonblocking, and do not include the executable in the tarball.
  • You must also submit a short report (pdf) with performance results (a single line plot). The line plot should present the execution times to run the parallel version on the input file life.1024x1024.data (for 16, 32, 64, and 128 processes), running on a 1024x1024 board for 500 iterations. In the report, you should mention:
    • how the initial data distribution was done
    • what the performance results are, and whether they are what you expected
You should put the code, Makefile, and report in a single directory (named LastName-FirstName-assign6), compress it to .tar.gz (LastName-FirstName-assign6.tar.gz), and upload it to Gradescope.

Tips

  • Zaratan primer
  • Use the compiler flag -g while debugging but -O2 when collecting performance numbers.
  • MPI_Wtime example
  • Use the --exclusive flag with sbatch when collecting execution times.

Grading

The project will be graded as follows:

Component                                  Percentage
Runs correctly with 16 (4x4) processes     20
Runs correctly with 64 (8x8) processes     20
Runs correctly with 64 (16x4) processes    25
Runs correctly with 64 (4x16) processes    25
Writeup                                    10

NOTE: If your program does not compile when submitted on Gradescope, you get 0 points. If your program does not run correctly, you do NOT get any points.