Report for Scalable I/O, July 1996

Scalable I/O quarterly report for Q2 1996

Joel Saltz and Anurag Acharya


In this quarter, we have made progress on four fronts. First, we have completed ports of our Jovian-2 parallel I/O library to a cluster of DEC Alpha workstations at the University of Maryland and to the Beowulf cluster of Pentium PCs running Linux at the NASA Goddard Space Flight Center. The Alpha cluster is a network of four-processor symmetric multiprocessor Digital AlphaServer 2100 4/275 workstations running Digital Unix 3.2. Each workstation contains four 275 MHz Alpha processors, 256 Mbytes of memory, and 4 Gbytes of disk space. The workstations are connected by Digital's Gigaswitch/ATM communications hub. Ports to both platforms use MPI and pthreads. We are currently porting Pathfinder and Climate, our benchmark satellite data processing programs, to both platforms.

The port to the Alpha cluster proved to be much more challenging than we had anticipated because of subtle synchronization errors that arise when separate processors are available to simultaneously run the threads that carry out computation, local I/O, and the communication associated with non-local I/O requests. On the other hand, we expect the shared-memory nodes to give us scope for optimizations that would not be possible on a machine composed of uniprocessor nodes. The port to the Beowulf cluster was, in principle, straightforward; to carry it out, however, we had to identify and port a version of MPI that operates reliably on Beowulf. This infrastructural work was probably worth the effort: ARPA has apparently decided to fund CESDIS to develop a new version of Beowulf, built out of 64 Pentium Pro nodes.
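To illustrate the kind of synchronization that becomes necessary on the SMP nodes, the following is a minimal sketch of a thread-per-service structure resembling the one described above: a local I/O thread services a request queue that a communication thread fills with non-local requests. The queue layout, names, and request fields are hypothetical and are not Jovian-2's actual internals; the point is only that, once the threads truly run in parallel on separate processors, every access to the shared queue must be protected.

    /* Illustrative sketch only; not Jovian-2 internals. */
    #include <pthread.h>
    #include <stdio.h>

    #define QUEUE_LEN 64

    typedef struct { long offset; long length; } io_request;

    static io_request queue[QUEUE_LEN];
    static int q_head = 0, q_tail = 0, shutting_down = 0;
    static pthread_mutex_t q_lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  q_nonempty = PTHREAD_COND_INITIALIZER;

    /* Called by the communication thread when a non-local request arrives. */
    static void enqueue_request(long offset, long length)
    {
        pthread_mutex_lock(&q_lock);
        queue[q_tail % QUEUE_LEN].offset = offset;
        queue[q_tail % QUEUE_LEN].length = length;
        q_tail++;
        pthread_cond_signal(&q_nonempty);
        pthread_mutex_unlock(&q_lock);
    }

    /* Local I/O thread.  On a uniprocessor the threads merely interleave;
     * on a 4-way SMP they run in parallel, so touching the queue outside
     * the lock is a latent race. */
    static void *io_thread(void *arg)
    {
        for (;;) {
            io_request req;
            pthread_mutex_lock(&q_lock);
            while (q_head == q_tail && !shutting_down)
                pthread_cond_wait(&q_nonempty, &q_lock);
            if (q_head == q_tail && shutting_down) {
                pthread_mutex_unlock(&q_lock);
                break;
            }
            req = queue[q_head % QUEUE_LEN];
            q_head++;
            pthread_mutex_unlock(&q_lock);

            /* ... perform the local read and ship the data back ... */
            printf("servicing request: offset=%ld length=%ld\n",
                   req.offset, req.length);
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t io;
        pthread_create(&io, NULL, io_thread, NULL);

        /* Stand-in for the communication thread receiving remote requests. */
        enqueue_request(0, 4096);
        enqueue_request(8192, 4096);

        pthread_mutex_lock(&q_lock);
        shutting_down = 1;
        pthread_cond_broadcast(&q_nonempty);
        pthread_mutex_unlock(&q_lock);
        pthread_join(io, NULL);
        return 0;
    }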

Second, we led the effort within the consortium to define the Scalable I/O High-level API. We identified a set of features suitable for such an API and evaluated three proposed APIs: the PFS API from Intel, the PIOFS API from IBM, and the MPI-IO API from the MPI-IO group (our report is available at http://www.cs.umd.edu/projects/hpsl/io/sio-hlapi/currentAPIs.ps). Based on this report and a series of discussions, the consortium has selected the MPI-IO API. We are currently participating in refining the MPI-IO interface to make it easier to use and to make it available to both MPI and non-MPI users. Further details on the effort to define the Scalable I/O High-level API can be found on our web site at http://www.cs.umd.edu/projects/hpsl/io/sio-hlapi/index.html.
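For readers unfamiliar with the MPI-IO style of interface, the sketch below shows collective file access in the form the calls were later standardized (MPI_File_open, MPI_File_set_view, MPI_File_read_all); the 1996 draft differs in details, and the file name and block size here are hypothetical. The interesting property is that all processes in a communicator open the file together and declare disjoint views, so the library can coalesce their requests.

    /* Sketch of MPI-IO collective access; file name and sizes are
     * illustrative, not taken from the report. */
    #include <mpi.h>
    #include <stdlib.h>

    #define BLOCK 1024   /* elements per process */

    int main(int argc, char **argv)
    {
        int rank;
        MPI_File fh;
        MPI_Offset disp;
        double *buf;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        buf = malloc(BLOCK * sizeof(double));

        /* Collective open: every process in the communicator participates. */
        MPI_File_open(MPI_COMM_WORLD, "avhrr.dat",
                      MPI_MODE_RDONLY, MPI_INFO_NULL, &fh);

        /* Give each process a disjoint, contiguous view of the file. */
        disp = (MPI_Offset) rank * BLOCK * sizeof(double);
        MPI_File_set_view(fh, disp, MPI_DOUBLE, MPI_DOUBLE,
                          "native", MPI_INFO_NULL);

        /* Collective read: the library may coalesce per-process requests. */
        MPI_File_read_all(fh, buf, BLOCK, MPI_DOUBLE, MPI_STATUS_IGNORE);

        MPI_File_close(&fh);
        free(buf);
        MPI_Finalize();
        return 0;
    }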

Third, we have developed two I/O templates that represent the I/O and computation patterns of Pathfinder and Climate. Together, they form a chain that represents the processing from level 1b to level 3. We have used a preliminary version of the Pathfinder template in our compiler effort. We are in the process of validating these templates and hope to make them available for distribution in the near future.

Finally, we have used our techniques for declustering multi-dimensional datasets (see the previous report for details) to distribute 30 GB of AVHRR data on our SP-2. Our distribution improved disk parallelism, that is, the number of disks active for individual queries, by 48 to 70 percent, and reduced the total estimated retrieval time by between 8 and 33 percent. We also evaluated schemes for the placement of data blocks assigned to a single disk and found that the average length of a read (without an intervening seek) can be improved by about a factor of two. These experiments are still in progress; we will describe them in detail in a forthcoming paper and in future reports.
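The declustering technique itself is described in the previous report and is not reproduced here. Purely to illustrate the goal of declustering, the sketch below assigns blocks of a 2-D dataset to disks with a classical disk-modulo style mapping, so that the blocks touched by a small range query tend to fall on different disks; the mapping, block coordinates, and disk count are hypothetical and are not the scheme we actually used.

    /* Illustration of declustering, not the report's scheme. */
    #include <stdio.h>

    /* Map block (i, j) to one of num_disks disks.  Real disk-modulo
     * schemes pick coefficients relatively prime to num_disks; 1 is
     * used here for simplicity. */
    static int decluster(int i, int j, int num_disks)
    {
        return (i + j) % num_disks;
    }

    int main(void)
    {
        const int num_disks = 4;
        int used[4] = {0, 0, 0, 0};
        int i, j, active = 0;

        /* Count how many distinct disks a 3x3 block range query touches. */
        for (i = 10; i < 13; i++)
            for (j = 20; j < 23; j++)
                used[decluster(i, j, num_disks)] = 1;

        for (i = 0; i < num_disks; i++)
            active += used[i];

        printf("3x3 query touches %d of %d disks\n", active, num_disks);
        return 0;
    }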

