High Performance Computing Systems (CMSC714)

Group Projects

The group project should be implemented in C/C++ (or Fortran) and use a parallel programming model such as MPI, OpenMP, Charm++, or CUDA. This project will be done in teams of ~3 people. The final deliverables will be a report and the code (with a Makefile and clear instructions for running it).
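As a starting point, the build/run deliverable might look like the following minimal sketch of a Makefile for an MPI project (the file name `main.c`, the target name, and the default process count are all placeholders; adjust for your own code and programming model):

```make
# Hypothetical project layout: a single main.c built with mpicc.
CC      = mpicc
CFLAGS  = -O2 -Wall
NPROCS ?= 4

project: main.c
	$(CC) $(CFLAGS) -o project main.c

# "make run" launches the program; override NPROCS on the command line,
# e.g. "make run NPROCS=8".
run: project
	mpirun -np $(NPROCS) ./project

clean:
	rm -f project
```

Whatever form it takes, the README or Makefile comments should state exactly how to build and launch the code on the target machine.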

Milestone                     Due on
Form groups                   March 4
Project description           March 11
Interim report                April 15
Project presentations         May 6, 11
Final project and report      May 13
Project Grading (25% of overall grade)

Component                     % of total
Presentation                  40
Final report and code         40
Peer evaluation               20

Peer evaluation: you are given $100 that you must allocate as a performance bonus among your group members, based on your assessment of their contributions to the project (you cannot keep any money for yourself). However, you can donate money to charity if you'd like. Each person should email Abhinav and Joy explaining how you are distributing your 100 virtual dollars among your teammates, with justification. Email subject: CMSC714: Group X: Peer Evaluation

What goes in the presentation?
  • Introduce your project so that it is understandable by a CS audience
  • Present what you are implementing or evaluating (serial / parallel algorithms)
  • Progress so far
  • Results (performance / performance analysis)
What goes in the report?
  • Details about the project: serial algorithm, parallel algorithm, languages being used
  • Deliverables and metrics for success
  • Results
  • Contributions of individual group members

Recent Projects

  • Parallelizing a Gaussian Solver
  • Parallelizing the Spectral Clustering Algorithm
  • Parallelize Raytracing
  • Parallelize K-Means Clustering
  • Simulating the spread of COVID-19 in closed environments
  • Parallel A* Search Algorithm
  • Data Parallelism for Distributed Deep Learning
  • Parallelization of Sudoku
  • A Consortium of Parallel QuickHull
  • Parallel Implementation of GMRES Solvers
  • Distributed Training of Neural Networks
  • Parallel N-Body Simulations with Long Range Interactions
  • Auto-tuning for scalable parallel 3-D FFT
  • Algebraic Multigrid with OpenMP, OpenACC and MPI
  • A Visual Debugging Tool for MPI Programs
  • Online Auto-Tuning of Collective Communication Operations in MPI
Other suggestions
  • Implement a parallel algorithm such as sorting or matrix multiply.
  • Application performance studies across one or more parallel machines - e.g. satellite data processing, parallel search, computer vision algorithms, bioinformatics
  • Application performance studies on GPUs
  • Reproduce results from a paper and extend them to current systems - e.g., a CPU vs. GPU comparison paper (pick a small number of application kernels)
  • Debunking a published paper