Principal Investigators

Mustafa Uysal
Anurag Acharya
Joel Saltz

Software Distribution

  • Trace Utilities

Related Information

  • Publication List

Applications for Measurement and Benchmarking of I/O on Parallel Computers

    Until recently, most applications developed for parallel machines avoided I/O as much as possible (distributed databases have been a notable exception). Typical parallel applications (usually scientific programs) would perform I/O only at the beginning and the end of execution, with the possible exception of infrequent checkpoints. This has been changing: I/O-intensive parallel programs have emerged as one of the leading consumers of cycles on parallel machines. This change has been driven by two trends. First, parallel scientific applications are being used to process larger datasets that do not fit in memory. Second, a large number of parallel machines are being used for non-scientific applications such as databases, data mining, and web servers for busy web sites (e.g. Altavista and NCSA). Characterizing these I/O-intensive applications is an important problem because their behavior strongly affects the design of I/O subsystems, operating systems, and filesystems.

    To this end, we have traced seven parallel I/O-intensive applications. These applications were run on eight nodes of an IBM SP-2. We used the AIX trace utility to trace I/O-related system calls (open, close, read, write and seek). We also captured all message-passing activity and context switches. This allowed us to accurately compute the inter-arrival times for I/O requests and to better understand application behavior. Some characteristics of these traces are described in a University of Maryland technical report:

      Mustafa Uysal, Anurag Acharya, and Joel Saltz. Requirements of I/O Systems for Parallel Machines: An Application-driven Study. Technical Report, CS-TR-3802, University of Maryland, College Park, May 1997.
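
    As an illustration of the inter-arrival computation mentioned above, the following is a minimal sketch in Python. The record layout (timestamp in seconds, node id, call name), the function name and the sample values are assumptions made for this example; they are not the format of the distributed traces and not the code used in the study.

```python
from collections import defaultdict

# I/O-related system calls named in the text above.
IO_CALLS = {"open", "close", "read", "write", "seek"}


def inter_arrival_times(records):
    """Return {node_id: [gaps in seconds between consecutive I/O calls]}.

    `records` is an iterable of (timestamp_seconds, node_id, call_name)
    tuples. This layout is hypothetical and chosen only for illustration.
    """
    stamps_by_node = defaultdict(list)
    for timestamp, node, call in records:
        if call in IO_CALLS:  # skip message-passing and other events
            stamps_by_node[node].append(timestamp)

    gaps = {}
    for node, stamps in stamps_by_node.items():
        stamps.sort()
        # Difference between each I/O call and the one before it.
        gaps[node] = [later - earlier for earlier, later in zip(stamps, stamps[1:])]
    return gaps


# Made-up sample events, not taken from any real trace.
sample = [
    (0.0, 0, "open"),
    (0.5, 0, "read"),
    (0.75, 1, "read"),
    (1.0, 0, "send"),  # message-passing event, ignored here
    (2.0, 0, "write"),
    (2.25, 1, "write"),
]

print(inter_arrival_times(sample))
```

    The gaps are computed per node in this sketch; a study of aggregate load on a shared I/O subsystem might instead merge the requests from all nodes before taking differences.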

    We are making these traces available for the use of other researchers. The traces are in ASCII. We provide a description of the trace format; utility programs to convert to/from a binary format; and library routines to access the trace records in binary format. For each of the applications, we provide a brief description of the application itself, the input dataset and the workload.

    Non-scientific Applications    Scientific Applications
    DB2 Parallel Edition           Titan
    Data Mining                    LU Factorization
    Parallel Web Server            Sparse Cholesky Factorization
    Parallel Text Search