Scalable I/O Quarterly Report for Q1 1996
Joel Saltz, Anurag Acharya
University of Maryland

We have developed a prototype source-to-source translation tool that carries out interprocedural analysis with the goal of replacing large synchronous I/O operations by corresponding asynchronous operations. This prototype was developed using Rice's ParaScope/D system. We have evaluated the efficacy of this tool using two benchmarks -- a Direct Simulation Monte Carlo code (from NASA Langley) that generates snapshots of its state, and an out-of-core satellite data processing template based on the Pathfinder program from NASA Goddard. We found that replacing large synchronous write operations by corresponding asynchronous operations can reduce the I/O waiting time by 25-67% and the overall execution time by 14-20%. The first of the two sketches below illustrates this transformation in miniature. A paper describing this research has been accepted for ICS'96 (Reference: Gagan Agrawal, Anurag Acharya and Joel Saltz, "An Interprocedural Framework for Placement of Asynchronous I/O Operations", to appear in the ACM International Conference on Supercomputing (ICS), May 1996).

We are porting the Jovian-2 parallel-I/O library to a cluster of SMPs (an ATM-connected network of Digital Sable multiprocessors running Digital Unix) and to a cluster of Pentiums that runs a version of Linux and is connected by two 100 Mb/s Ethernets. As described in our previous report, Jovian-2 is a multi-threaded parallel-I/O library developed at the University of Maryland. It provides an interface similar to the POSIX lio_listio() interface (the second sketch below shows that style of batched request) and can handle both local and non-local requests. The performance of Jovian-2 on microbenchmarks and on an AVHRR satellite data processing application from NASA Goddard, measured on our IBM SP-2, is described in a paper that will appear in IOPADS'96 (Reference: Anurag Acharya, Mustafa Uysal, Robert Bennett, Assaf Mendelson, Michael Beynon, Jeff Hollingsworth, Joel Saltz and Alan Sussman, "Tuning the Performance of I/O Intensive Parallel Applications", to appear in the Fourth Annual Workshop on I/O in Parallel and Distributed Systems (IOPADS), May 1996). As part of the Jovian effort, we plan to investigate optimizations specific to parallel architectures and to workstation networks composed of multiple SMP nodes.
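The first sketch is a minimal illustration of the transformation, written against the standard POSIX asynchronous I/O interface (compile with -lrt on Linux). The file name, snapshot size, and the computation being overlapped are hypothetical, and the placement shown -- issue the write early, wait as late as possible -- stands in for what the interprocedural analysis determines automatically.

    #include <aio.h>
    #include <fcntl.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    #define SNAPSHOT_BYTES (8 * 1024 * 1024)   /* made-up snapshot size */

    /* Placeholder for computation that does not touch the snapshot buffer. */
    static void advance_simulation(void) { }

    int main(void)
    {
        char *snapshot = malloc(SNAPSHOT_BYTES);
        int fd = open("snapshot.dat", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        memset(snapshot, 0, SNAPSHOT_BYTES);    /* stand-in for snapshot data */

        /* Synchronous form -- the process idles until the write finishes:
         *     write(fd, snapshot, SNAPSHOT_BYTES);
         *     advance_simulation();
         *
         * Asynchronous form -- initiate the write early and wait as late
         * as possible, so the write overlaps the computation: */
        struct aiocb cb;
        memset(&cb, 0, sizeof cb);
        cb.aio_fildes = fd;
        cb.aio_buf    = snapshot;
        cb.aio_nbytes = SNAPSHOT_BYTES;
        cb.aio_offset = 0;
        aio_write(&cb);                         /* issue the write */

        advance_simulation();                   /* overlapped computation */

        const struct aiocb *pending[1] = { &cb };
        aio_suspend(pending, 1, NULL);          /* block only if still in flight */
        aio_return(&cb);                        /* collect completion status */

        close(fd);
        free(snapshot);
        return 0;
    }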
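The second sketch shows, for reference, the standard POSIX lio_listio() call that the Jovian-2 interface resembles: a batch of requests is described by an array of control blocks and submitted in a single call. This is stock POSIX code with made-up parameters, not the Jovian-2 API itself.

    #include <aio.h>
    #include <string.h>
    #include <sys/types.h>

    /* Issue a batch of writes with one call and wait for all of them.
     * LIO_NOWAIT would instead return immediately; completion can then
     * be checked with aio_error()/aio_return() on each control block. */
    int write_batch(int fd, char *bufs[], size_t lens[], off_t offs[], int n)
    {
        struct aiocb cbs[n];
        struct aiocb *list[n];

        for (int i = 0; i < n; i++) {
            memset(&cbs[i], 0, sizeof cbs[i]);
            cbs[i].aio_fildes     = fd;
            cbs[i].aio_buf        = bufs[i];
            cbs[i].aio_nbytes     = lens[i];
            cbs[i].aio_offset     = offs[i];
            cbs[i].aio_lio_opcode = LIO_WRITE;  /* LIO_READ for reads */
            list[i] = &cbs[i];
        }
        return lio_listio(LIO_WAIT, list, n, NULL);
    }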
We have continued our investigation of data declustering techniques for multidimensional datasets, with the primary goal of facilitating interactive exploration of large datasets by minimizing response time. We have extended the three best-known index-based schemes for declustering Cartesian product files (Disk-Modulo, Fieldwise-XOR and Hilbert-Curve) to grid files, which allow better utilization of disk space. Using simulation experiments, we have shown that the scalability of Disk-Modulo and Fieldwise-XOR for multidimensional range queries is limited: as the number of disks is increased beyond a threshold, the response time no longer decreases. This result is corroborated by an analytical study. The response time for Hilbert-Curve scales better than that for Disk-Modulo or Fieldwise-XOR, but the gap between its performance and the best possible performance increases with the degree of skew in the data distribution. The Disk-Modulo and Fieldwise-XOR assignment functions, and the response-time measure we use, are sketched at the end of this report. As an alternative to the index-based schemes, we have developed a declustering algorithm based on a proximity measure. To evaluate this new algorithm, we are using both simulation studies and experimental measurements.

Results from our simulation experiments indicate that the proposed algorithm achieves better declustering than the algorithms we compared it against, particularly for configurations with large numbers of disks. A paper describing this research has been accepted for IPPS'96 (Reference: Bongki Moon, Anurag Acharya and Joel Saltz, "Study of Scalable Declustering Algorithms for Parallel Grid Files", to appear in the 10th International Parallel Processing Symposium (IPPS), April 1996). We are now carrying out an experimental evaluation of the declustering algorithms: we have recently implemented them on our 16-processor SP-2 and are evaluating their performance on three large datasets -- snapshots from the Direct Simulation Monte Carlo code mentioned above, snapshots from a 3-D flame simulation code from NRL, and AVHRR image data from NOAA's TIROS satellites.
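To make the index-based schemes concrete, the sketch below gives the textbook Disk-Modulo and Fieldwise-XOR assignment functions for a Cartesian product file, together with the usual response-time measure for a range query -- the maximum number of selected blocks that land on any one disk, assuming one block per disk can be fetched per parallel I/O step. The disk count and query ranges are made up, and neither the grid-file extension nor the proximity-based algorithm described above is shown here.

    #include <stdio.h>

    #define NDISKS 8

    /* Disk-Modulo: disk(i0,...,i{d-1}) = (i0 + ... + i{d-1}) mod M */
    static int disk_modulo(const int idx[], int dim)
    {
        int s = 0;
        for (int k = 0; k < dim; k++) s += idx[k];
        return s % NDISKS;
    }

    /* Fieldwise-XOR: disk(i0,...,i{d-1}) = (i0 XOR ... XOR i{d-1}) mod M */
    static int fieldwise_xor(const int idx[], int dim)
    {
        int x = 0;
        for (int k = 0; k < dim; k++) x ^= idx[k];
        return x % NDISKS;
    }

    /* Response time of a 2-D range query under a given assignment:
     * the maximum number of selected blocks mapped to any one disk. */
    static int response_time(int (*disk)(const int[], int),
                             int lo0, int hi0, int lo1, int hi1)
    {
        int load[NDISKS] = { 0 };
        for (int i = lo0; i <= hi0; i++)
            for (int j = lo1; j <= hi1; j++) {
                int idx[2] = { i, j };
                load[disk(idx, 2)]++;
            }
        int max = 0;
        for (int d = 0; d < NDISKS; d++)
            if (load[d] > max) max = load[d];
        return max;
    }

    int main(void)
    {
        /* Hypothetical 12x12-block range query on a 2-D grid. */
        printf("Disk-Modulo  : %d steps\n",
               response_time(disk_modulo, 0, 11, 0, 11));
        printf("Fieldwise-XOR: %d steps\n",
               response_time(fieldwise_xor, 0, 11, 0, 11));
        return 0;
    }

For a query that selects B blocks on M disks, the ideal response time under this measure is ceil(B/M); the scalability limit noted above shows up as a maximum per-disk load that stops shrinking as M grows.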