CMSC 714 (Fall 2002)

Tentative Reading List



9/3 Parallel Computing and Parallel Computers

Lecture Notes

9/5 Applications of Parallel Computing

Lecture Notes

Programming Models

9/10 Distributed Shared Memory

K. Li and P. Hudak, "Memory Coherence in Shared Virtual Memory Systems", ACM Transactions on Computer Systems, 7(4), Nov. 1989, pp. 321-359 (PDF).

Pete Keleher, Alan L. Cox, Sandhya Dwarkadas, Willy Zwaenepoel, "An Evaluation of Software Based Release Consistent Protocols", JPDC, 29(2), Sept. 1995, pp. 126-141. (Postscript)

9/12 Expressing Parallelism (Explicit Control)

"The PVM Concurrent Computing System: Evolution, Experiences, and Trends", (Postscript)

J. J. Dongarra, S. W. Otto, M. Snir, and D. Walker, "A message passing standard for MPP and workstations," CACM, 39(7), 1996, pp. 84-90. (PDF)

9/17 Expressing Parallelism (Implicit Control)

William W. Carlson et al., “Introduction to UPC and Language Specification”, CCS-TR-99-157 (PDF)

L. Dagum and R. Menon, "OpenMP: An Industry-Standard API for Shared-Memory Programming," IEEE Computational Science & Engineering, 5(1), 1998, pp. 46-55. (PDF)

9/19 Expressing Parallelism (Data Layout)

"Compiling HPF for Distributed Memory MIMD Computers", (Postscript)

“14.9 TFLOPS Three-dimensional Fluid Simulation for Fusion Science with HPF on the Earth Simulator,” Hitoshi Sakagami, Hitoshi Murai, Yoshiki Seo, Mitsuo Yokokawa, to appear, SC’02 (PDF)



9/24 Shared Memory

Laudon, J., Lenoski, D., “The SGI Origin: a ccNUMA highly scalable server”, ISCA '97, pp. 241-51, May 1997 (PDF)

Alan E. Charlesworth, “The Sun Fireplane System Interconnect”, Proceedings of SC’01, Nov. 2001. (PDF)

9/26 Message Passing and Communication

Fabrizio Petrini, Wu-chun Feng, Adolfy Hoisie, Salvador Coll, Eitan Frachtenberg, “The Quadrics Network: High-Performance Clustering Technology,” IEEE Micro, Jan-Feb 2002, pp. 46-57. (PDF)

S. L. Scott, "Synchronization and Communication in the T3E Multiprocessor", Proc. ASPLOS VII, Cambridge, MA, Oct. 1996 (Postscript)

10/1 Vectors and Threading

Gail Alverson, Preston Briggs, Susan , Simon Kahan, Richard Korry, “Tera hardware-software cooperation”, SC’97, Nov. 1997, (PDF)


10/3 Computational Grids

Grid book, Chapters 1-2


10/8 – No class


10/10 Event Ordering

L. Lamport, "Time, Clocks, and the Ordering of Events in a Distributed System," CACM, 21(7), 1978, pp. 558-564 (PDF).

Netzer, R. H. B. and Miller, B.P., "What are Race Conditions? Some Issues and Formalizations", LOPLAS 1(1), March 1992. (PDF)

10/15 Race Detection

S. Savage, M. Burrows, G. Nelson, P. Sobalvarro, and T. Anderson, "Eraser: A Dynamic Data Race Detector for Multi-Threaded Programs," Proceedings of the 16th Symposium on Operating Systems Principles (PDF).

A. Dinning and E. Schonberg, “An empirical comparison of monitoring algorithms for access anomaly detection,” Second ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming, 1990, pp. 1-10 (PDF).


10/17 Performance Metrics

A. J. Goldberg and J. L. Hennessy, "Performance Debugging Shared Memory Multiprocessor Programs with MTOOL", Supercomputing'91. Nov. 18-22, 1991, Albuquerque, NM, pp. 481-490 (PDF).

J. K. Hollingsworth, "Critical Path Profiling of Message Passing and Shared-memory Programs," IEEE Transactions on Parallel and Distributed Systems, 9(10), 1998, pp. 1029-1040 (PDF).

10/22 Midterm Exam


10/24 Data Collection and Instrumentation (Jeffrey Odom)

J. R. Larus and E. Schnarr, "EEL: Machine-Independent Executable Editing", In Proceedings of the 1995 SIGPLAN Conference on Programming Language Design and Implementation, pages 291-300, June 1995. (Postscript).

B. R. Buck and J. K. Hollingsworth, “An API for Runtime Code Patching,” International Journal of High Performance Computing Applications, 14(4), Winter 2000, pp. 317-329. (PDF)


10/29 Scheduling - Short Term (Lakshmi Srinivasan)

John K. Ousterhout, "Scheduling Techniques for Concurrent Systems", International Conference on Distributed Computing Systems, 1982, pp. 22-30 (PDF).

A. C. Dusseau, R. H. Arpaci, D. E. Culler, "Effective Distributed Scheduling of Parallel Workloads", ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems, May 1996, Philadelphia, PA. (PDF).


10/31 Performance Tools (Sandro Fouche)

L. A. De Rose, D. A. Reed, “SvPablo: A Multi-Language Architecture-Independent Performance Analysis System,” Proceedings of the 1999 International Conference on Parallel Processing (PDF).

B. P. Miller, M. D. Callaghan, J. M. Cargille, J. K. Hollingsworth, R. B. Irvin, K. L. Karavanic, K. Kunchithapadam, and T. Newhall, "The Paradyn Parallel Performance Measurement Tools", IEEE Computer, 28(11), Nov. 1995, pp. 37-46. (PDF)

11/5 Computational Steering (Suresh Aryangat)

W. Gu, G. Eisenhauer, E. Kraemer, K. Schwan, J. Stasko, J. Vetter, and N. Mallavurupu, "Falcon: On-line Monitoring and Steering of Large-Scale Parallel Programs," Frontiers '95. Feb 6-9, 1995, McLean, VA, IEEE Press, pp. 422-429. (Postscript)

R. L. Ribler, J. S. Vetter, H. Simitci, and D. A. Reed, "Autopilot: Adaptive Control of Distributed Applications," High Performance Distributed Computing, Chicago, IL, pp. 172-9, 1998 (PDF).

11/12 Cache Tools (Nick Petroni)

John Mellor-Crummey, David Whalley, Ken Kennedy, “Improving Memory Hierarchy Performance for Irregular Applications Using Data and Computation Reorderings,” International Journal of Parallel Programming, 29(3), June 2001. (PDF)

Margaret Martonosi, Anoop Gupta, Thomas Anderson, “MemSpy: analyzing memory system bottlenecks in programs”, SIGMETRICS ’92 (PDF)


11/7 Resource Aware Applications (Brinda Ganesh)

Grid Book - Chapter 12

B. D. Noble, M. Satyanarayanan, D. Narayanan, J. E. Tilton, J. Flinn, and K. R. Walker, "Agile Application-Aware Adaptation for Mobility," Proceedings of the 16th ACM Symposium on Operating Systems Principles. Oct. 1997. (PDF)

OS Issues

11/14 Scheduling - Long Term (Lorin Hochstein)

D. G. Feitelson and A. Mu'alem Weil, "Utilization and Predictability in Scheduling the IBM SP2 with Backfilling," 12th Intl. Parallel Processing Symposium, April 1998, Orlando, Florida, pp. 542-546. (Use this extended form: PDF)

11/19 Grid OS Support (Kursad Albayraktaroglu)

M. Litzkow, M. Livny, and M. Mutka, "Condor - A Hunter of Idle Workstations," International Conference on Distributed Computing Systems. June 1988, pp. 104-111. (PDF).

K. D. Ryu, J. K. Hollingsworth, and P. Keleher, “Efficient Network and I/O Throttling for Fine-Grain Cycle Stealing,” SC'01, November 2001 (PDF).

Grid Book - Chapter 11

11/21 Parallel I/O (Polyvios Pratikakis)

Terry Jones, Alice Koniges and R. Kim Yates, “Performance of the IBM General Parallel File System,” 14th International Parallel and Distributed Processing Symposium (IPDPS'00), (PDF)

A. Acharya, M. Uysal, and J. Saltz, "Active Disks: Programming Model, Algorithms and Evaluation," Eighth International Conference on Architectural Support for Programming Languages and Operating Systems. Oct. 1998, San Jose, CA. (PDF)

11/26 Work in Progress session

12/3 Performance Prediction (Brian Krznarich)

M. E. Crovella, Thomas J. LeBlanc, "Parallel Performance Prediction Using Lost Cycles", Proceedings of Supercomputing '94, 1994. (Postscript)

E. Deelman, et al., "Poems: end-to-end performance design of large parallel adaptive computational systems," WOSP: International Workshop on Software and Performance. Oct. 1998, Santa Fe, NM, pp. 18-30. (PDF)

Commercial Applications


12/5 High Performance Web Servers (Rakesh B Bobba)

A. Fox, S.D. Gribble, Y. Chawathe, E.A. Brewer, P. Gauthier, “Cluster-based scalable network services,” SOSP’97, pp. 78-91 (PDF).

D. Karger, E. Lehman, T. Leighton, R. Panigrahy, M. Levine and D. Lewin, “Consistent hashing and random trees: distributed caching protocols for relieving hot spots on the World Wide Web,” STOC’97, pp. 654-663 (PDF)

12/10 Project Presentations

12/12 Project Presentations