Tuning the Performance of I/O-Intensive Parallel Applications

Anurag Acharya, Mustafa Uysal, Robert Bennett, Assaf Mendelson,
Michael Beynon, Jeff Hollingsworth, Joel Saltz, and Alan Sussman

Fourth Annual Workshop on I/O in Parallel and Distributed Systems, Philadelphia, Pennsylvania, May 27, 1996


Getting good I/O performance from parallel programs is a critical problem for many application domains. In this paper, we report our experience tuning the I/O performance of four application programs from the areas of sensor data processing and linear algebra. After tuning, three of the four applications achieve effective I/O rates of over 100 MB/s on 16 processors. The total volume of I/O required by the programs ranged from about 75 MB to over 200 GB. We report the lessons learned in achieving high I/O performance from these applications, including the need for code restructuring, local disks on every node, and overlapping I/O with computation. We also report our experience in achieving high performance on peer-to-peer configurations. Finally, we comment on the necessity of complex I/O interfaces, such as collective I/O and strided requests, for achieving high performance.

Postscript (compressed 137K)

A previous version appeared as CRPC TR95632-S (compressed 153K)

Last Updated:  03/01/99