Scalable I/O Quarterly Report for Q1 1996
Joel Saltz, Anurag Acharya
University of Maryland

We have developed a prototype source-to-source translation tool that carries out interprocedural analysis with the goal of replacing large synchronous I/O operations by corresponding asynchronous operations. This prototype was developed using Rice's ParaScope/D system. We have evaluated the efficacy of this tool using two benchmarks -- a Direct Simulation Monte Carlo code (from NASA Langley) that generates snapshots of its state, and an out-of-core satellite data processing template based on the Pathfinder program from NASA Goddard. We found that replacing large synchronous write operations by corresponding asynchronous operations can reduce the I/O waiting time by 25-67% and the overall execution time by 14-20%. The first of the two sketches below illustrates this transformation in miniature. A paper describing this research has been accepted for ICS'96 (Reference: Gagan Agrawal, Anurag Acharya and Joel Saltz, "An Interprocedural Framework for Placement of Asynchronous I/O Operations", to appear in the ACM International Conference on Supercomputing (ICS), May 1996).

We are porting the Jovian-2 parallel-I/O library to a cluster of SMPs (an ATM-connected network of Digital Sable multiprocessors running Digital Unix) and to a cluster of Pentiums that runs a version of Linux and is connected by two 100 Mb/s Ethernets. As described in our previous report, Jovian-2 is a multi-threaded parallel-I/O library developed at the University of Maryland. It provides an interface similar to the POSIX lio_listio() interface (the second sketch below shows that style of batched request) and can handle both local and non-local requests. The performance of Jovian-2 on microbenchmarks and on an AVHRR satellite data processing application from NASA Goddard, measured on our IBM SP-2, is described in a paper that will appear in IOPADS'96 (Reference: Anurag Acharya, Mustafa Uysal, Robert Bennett, Assaf Mendelson, Michael Beynon, Jeff Hollingsworth, Joel Saltz and Alan Sussman, "Tuning the Performance of I/O Intensive Parallel Applications", to appear in the Fourth Annual Workshop on I/O in Parallel and Distributed Systems (IOPADS), May 1996). As part of the Jovian effort, we plan to investigate optimizations specific to parallel architectures and to workstation networks composed of multiple SMP nodes.
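The first sketch is a minimal illustration of the transformation, written against the standard POSIX asynchronous I/O interface (compile with -lrt on Linux). The file name, snapshot size, and the computation being overlapped are hypothetical, and the placement shown -- issue the write early, wait as late as possible -- stands in for what the interprocedural analysis determines automatically.

    #include <aio.h>
    #include <fcntl.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    #define SNAPSHOT_BYTES (8 * 1024 * 1024)   /* made-up snapshot size */

    /* Placeholder for computation that does not touch the snapshot buffer. */
    static void advance_simulation(void) { }

    int main(void)
    {
        char *snapshot = malloc(SNAPSHOT_BYTES);
        int fd = open("snapshot.dat", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        memset(snapshot, 0, SNAPSHOT_BYTES);    /* stand-in for snapshot data */

        /* Synchronous form -- the process idles until the write finishes:
         *     write(fd, snapshot, SNAPSHOT_BYTES);
         *     advance_simulation();
         *
         * Asynchronous form -- initiate the write early and wait as late
         * as possible, so the write overlaps the computation: */
        struct aiocb cb;
        memset(&cb, 0, sizeof cb);
        cb.aio_fildes = fd;
        cb.aio_buf    = snapshot;
        cb.aio_nbytes = SNAPSHOT_BYTES;
        cb.aio_offset = 0;
        aio_write(&cb);                         /* issue the write */

        advance_simulation();                   /* overlapped computation */

        const struct aiocb *pending[1] = { &cb };
        aio_suspend(pending, 1, NULL);          /* block only if still in flight */
        aio_return(&cb);                        /* collect completion status */

        close(fd);
        free(snapshot);
        return 0;
    }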
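The second sketch shows, for reference, the standard POSIX lio_listio() call that the Jovian-2 interface resembles: a batch of requests is described by an array of control blocks and submitted in a single call. This is stock POSIX code with made-up parameters, not the Jovian-2 API itself.

    #include <aio.h>
    #include <string.h>
    #include <sys/types.h>

    /* Issue a batch of writes with one call and wait for all of them.
     * LIO_NOWAIT would instead return immediately; completion can then
     * be checked with aio_error()/aio_return() on each control block. */
    int write_batch(int fd, char *bufs[], size_t lens[], off_t offs[], int n)
    {
        struct aiocb cbs[n];
        struct aiocb *list[n];

        for (int i = 0; i < n; i++) {
            memset(&cbs[i], 0, sizeof cbs[i]);
            cbs[i].aio_fildes     = fd;
            cbs[i].aio_buf        = bufs[i];
            cbs[i].aio_nbytes     = lens[i];
            cbs[i].aio_offset     = offs[i];
            cbs[i].aio_lio_opcode = LIO_WRITE;  /* LIO_READ for reads */
            list[i] = &cbs[i];
        }
        return lio_listio(LIO_WAIT, list, n, NULL);
    }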
We have continued our investigation of data declustering techniques for multidimensional datasets, with the primary goal of facilitating interactive exploration of large datasets by minimizing response time. We have extended the three best-known index-based schemes for declustering Cartesian product files (Disk-Modulo, Fieldwise-XOR and Hilbert-Curve) to grid files, which allow better utilization of disk space. Using simulation experiments, we have shown that the scalability of Disk-Modulo and Fieldwise-XOR for multidimensional range queries is limited: as the number of disks is increased beyond a threshold, the response time no longer decreases. This result is corroborated by an analytical study. The response time for Hilbert-Curve scales better than that for Disk-Modulo or Fieldwise-XOR, but the gap between its performance and the best possible performance increases with the degree of skew in the data distribution. The Disk-Modulo and Fieldwise-XOR assignment functions, and the response-time measure we use, are sketched at the end of this report. As an alternative to the index-based schemes, we have developed a declustering algorithm based on a proximity measure. To evaluate this new algorithm, we are using both simulation studies and experimental measurements.

Results from our simulation experiments indicate that the proposed algorithm achieves better declustering than the algorithms we compared it against, particularly for configurations with large numbers of disks. A paper describing this research has been accepted for IPPS'96 (Reference: Bongki Moon, Anurag Acharya and Joel Saltz, "Study of Scalable Declustering Algorithms for Parallel Grid Files", to appear in the 10th International Parallel Processing Symposium (IPPS), April 1996). We are now carrying out an experimental evaluation of the declustering algorithms: we have recently implemented them on our 16-processor SP-2 and are evaluating their performance on three large datasets -- snapshots from the Direct Simulation Monte Carlo code mentioned above, snapshots from a 3-D flame simulation code from NRL, and AVHRR image data from NOAA's TIROS satellites.
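To make the index-based schemes concrete, the sketch below gives the textbook Disk-Modulo and Fieldwise-XOR assignment functions for a Cartesian product file, together with the usual response-time measure for a range query -- the maximum number of selected blocks that land on any one disk, assuming one block per disk can be fetched per parallel I/O step. The disk count and query ranges are made up, and neither the grid-file extension nor the proximity-based algorithm described above is shown here.

    #include <stdio.h>

    #define NDISKS 8

    /* Disk-Modulo: disk(i0,...,i{d-1}) = (i0 + ... + i{d-1}) mod M */
    static int disk_modulo(const int idx[], int dim)
    {
        int s = 0;
        for (int k = 0; k < dim; k++) s += idx[k];
        return s % NDISKS;
    }

    /* Fieldwise-XOR: disk(i0,...,i{d-1}) = (i0 XOR ... XOR i{d-1}) mod M */
    static int fieldwise_xor(const int idx[], int dim)
    {
        int x = 0;
        for (int k = 0; k < dim; k++) x ^= idx[k];
        return x % NDISKS;
    }

    /* Response time of a 2-D range query under a given assignment:
     * the maximum number of selected blocks mapped to any one disk. */
    static int response_time(int (*disk)(const int[], int),
                             int lo0, int hi0, int lo1, int hi1)
    {
        int load[NDISKS] = { 0 };
        for (int i = lo0; i <= hi0; i++)
            for (int j = lo1; j <= hi1; j++) {
                int idx[2] = { i, j };
                load[disk(idx, 2)]++;
            }
        int max = 0;
        for (int d = 0; d < NDISKS; d++)
            if (load[d] > max) max = load[d];
        return max;
    }

    int main(void)
    {
        /* Hypothetical 12x12-block range query on a 2-D grid. */
        printf("Disk-Modulo  : %d steps\n",
               response_time(disk_modulo, 0, 11, 0, 11));
        printf("Fieldwise-XOR: %d steps\n",
               response_time(fieldwise_xor, 0, 11, 0, 11));
        return 0;
    }

For a query that selects B blocks on M disks, the ideal response time under this measure is ceil(B/M); the scalability limit noted above shows up as a maximum per-disk load that stops shrinking as M grows.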