Progress Report on the Parallel I/O project


Principal Investigator: Joel Saltz
Department of Computer Science
University of Maryland, College Park MD 20742

July 5, 1995


IDENTIFYING BENCHMARK PROGRAMS:

  1. pathfinder: this is the workhorse program of the AVHRR Land Pathfinder effort. It processes AVHRR global area coverage (4.4 km resolution) data, performing calibration, topographic correction, masking of cloudy pixels, navigation and compositing. It does not perform atmospheric correction. Algorithms for the different operations are fixed. The input to the program consists of level 1B sensor data: fourteen files per day, each containing 42-45 megabytes of sensor data. In addition to the AVHRR data, it reads several ancillary data files, such as a topographic map of the world, a land-sea map of the world, ephemeris data for the days covered by the input data, and satellite-specific information. The output is a twelve-band 5004x2168 image of the world which includes, among other things, calibrated versions of the input data and NDVI. The output is generated as an HDF file. The input sensor data is about 600 megabytes, the output file is about 220 megabytes, and the ancillary data is about 100 megabytes.

  2. climate: this program is designed to produce the Pathfinder AVHRR Land Climate data product, a level 3, 8-bit, quasi-10-day NDVI product on a 360-degree by 180-degree lat/lon grid. It takes the HDF file generated by pathfinder as input and generates an output climate file (also in HDF format). In addition to the file generated by pathfinder, it reads an ancillary file containing a land/sea map of the world. Together with pathfinder, it forms a complete processing chain for earth science data. (A sketch of this kind of compositing step appears after this list.)

  3. gaps: a program to process AVHRR global area coverage data. It performs calibration, atmospheric correction, topographic correction, inverse navigation, masking of cloudy pixels, compositing and projection. Multiple algorithms are available for most of these operations. It generates a seven-band image of a rectangular window. The sensor data is read in chunks of 400 scans each (see the read-loop sketch after this list). Ancillary data is 15-25 megabytes and is loaded into memory. The size of the output depends on the window; for the standard size, 1216x1168, it is about 20 megabytes. The size of the input sensor data is about 600 megabytes per day.

  4. extract: a program to extract windows of one or more bands from the image generated by gaps and to reproject them to a desired projection. It uses the standard USGS prj library to implement the reprojection (an illustrative inverse-mapping sketch appears after this list). In addition to the image generated by gaps, it reads in an ephemeris data file. The size of the output depends on the window and the bands desired.

  5. Land Analysis System: this is an image processing, image analysis and raster GIS system for land analysis. It processes local area coverage (1.1 km resolution) data. It stitches together the data from the multiple receiving sites, performs calibration, topographic and atmospheric correction, masking of cloudy pixels, navigation and compositing. LAS allows algorithms for different operations to be combined easily. It also allows multiple invocations of a processing program to be run concurrently.

  6. Interactive Image Spreadsheet: this is a program for interactive image processing and for browsing automatically generated data products, either to study individual phenomena (for example, hurricanes) or to animate sequences of images that display temporal and spatial variation in conditions (for example, rainfall variation or the spreading of fires and deserts). For such programs, it is not possible to determine the size or pattern of I/O requests a priori.

  7. The SeaWiFS ocean data processing program: this is an immediate precursor to MODIS ocean color processing and is currently under development. It generates a hierarchy of data products which parallels the hierarchy of data products planned for EOSDIS. We plan to study the end-to-end I/O requirements of SeaWiFS data generation. The other earth science applications used to empirically estimate EOS requirements have been land analysis programs; this application will improve coverage by providing information about the requirements of ocean data processing programs. All the data products are generated as HDF files.

  8. Physical-Space Statistical Analysis System (PSAS): the goal is to compute the analysis correction for a short-range forecast of the global atmospheric state using all available observations in a time window centered on the forecast time. The theoretical minimum error variance solution to this problem involves solving a system of linear equations whose order is the number of observations in the data window, i.e. a system with 150,000 to 250,000 equations and unknowns. Computationally, PSAS centers on the solution of a linear system whose coefficient matrix is a large banded covariance matrix. The matrix is symmetric positive definite and is generated by partitioning the Earth's surface into triangular regions; the banded structure arises because the correlation between regions far apart is negligible. The size of this matrix is estimated to be about 32 gigabytes. Currently, the matrix elements are not stored on disk; instead, they are regenerated as needed in every iteration (see the last sketch after this list).
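
The following sketch illustrates the kind of compositing step climate (item 2) performs, assuming one-degree cells and maximum-value compositing of NDVI; the grid origin, cell size, compositing rule and sample coordinates are our assumptions for illustration, not the program's documented algorithm.

    #include <float.h>
    #include <stdio.h>

    #define NLON 360
    #define NLAT 180

    /* Assumed layout: one-degree cells, row 0 at 90N, column 0 at 180W.
       Maximum-value compositing: each cell keeps the largest NDVI seen
       during the compositing period. */
    static float grid[NLAT][NLON];

    static void composite_pixel(double lat, double lon, float ndvi)
    {
        int row = (int)(90.0 - lat);
        int col = (int)(lon + 180.0);
        if (row < 0 || row >= NLAT || col < 0 || col >= NLON)
            return;
        if (ndvi > grid[row][col])
            grid[row][col] = ndvi;
    }

    int main(void)
    {
        for (int i = 0; i < NLAT; i++)
            for (int j = 0; j < NLON; j++)
                grid[i][j] = -FLT_MAX;          /* "no data" sentinel */
        composite_pixel(38.99, -76.94, 0.62f);  /* one sample pixel */
        printf("%f\n", grid[51][103]);
        return 0;
    }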
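
A minimal sketch of the chunked read loop in gaps (item 3), consuming 400 scans per request; the per-scan record size is an assumption (3220 bytes is a typical NOAA level-1B GAC record size), not a value taken from the program.

    #include <stdio.h>
    #include <stdlib.h>

    #define SCANS_PER_CHUNK 400
    #define SCAN_BYTES 3220    /* assumed level-1B GAC record size */

    int main(int argc, char **argv)
    {
        if (argc != 2) {
            fprintf(stderr, "usage: %s orbit-file\n", argv[0]);
            return 1;
        }
        FILE *fp = fopen(argv[1], "rb");
        if (fp == NULL) { perror("fopen"); return 1; }

        char *chunk = malloc((size_t)SCANS_PER_CHUNK * SCAN_BYTES);
        size_t nscans;
        /* Each request is coarse-grained: 400 scans, roughly 1.3 MB */
        while ((nscans = fread(chunk, SCAN_BYTES, SCANS_PER_CHUNK, fp)) > 0) {
            /* calibrate, correct, mask and composite the scans here */
            printf("read a chunk of %zu scans\n", nscans);
        }
        free(chunk);
        fclose(fp);
        return 0;
    }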
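
To show the shape of the reprojection in extract (item 4), here is an inverse-mapping sketch with nearest-neighbour sampling; the toy transform stands in for the USGS prj library, whose actual interface we do not reproduce here.

    #include <stdio.h>

    #define IN_W  8
    #define IN_H  6
    #define OUT_W 8
    #define OUT_H 6

    /* Stand-in inverse mapping (a vertical flip); the real program asks
       the projection library where each output pixel comes from. */
    static void inverse_map(int orow, int ocol, double *irow, double *icol)
    {
        *irow = (double)(IN_H - 1 - orow);
        *icol = (double)ocol;
    }

    int main(void)
    {
        unsigned char in[IN_H][IN_W], out[OUT_H][OUT_W];
        for (int r = 0; r < IN_H; r++)
            for (int c = 0; c < IN_W; c++)
                in[r][c] = (unsigned char)(r * IN_W + c);

        /* For each output pixel, sample the nearest input pixel */
        for (int r = 0; r < OUT_H; r++)
            for (int c = 0; c < OUT_W; c++) {
                double ir, ic;
                inverse_map(r, c, &ir, &ic);
                out[r][c] = in[(int)(ir + 0.5)][(int)(ic + 0.5)];
            }
        printf("out[0][3] = %d\n", out[0][3]);
        return 0;
    }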
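
The last sketch shows why PSAS (item 8) can avoid storing its covariance matrix: in an iterative solver the dominant kernel is a matrix-vector product, and each banded entry can be regenerated when it is needed. The toy correlation function, problem size and band width are our assumptions; only the structure (regenerate rather than read) reflects the description above.

    #include <math.h>
    #include <stdio.h>

    #define N    8    /* toy order; PSAS is 150,000-250,000 */
    #define BAND 3    /* entries outside the band are treated as zero */

    /* Assumed covariance model: correlation decays with separation */
    static double cov(int i, int j)
    {
        return exp(-fabs((double)(i - j)));
    }

    /* Matrix-vector product with on-the-fly regeneration: no element of
       the (conceptually ~32 GB) matrix is ever stored */
    static void matvec(const double *x, double *y)
    {
        for (int i = 0; i < N; i++) {
            int lo = i - BAND < 0 ? 0 : i - BAND;
            int hi = i + BAND > N - 1 ? N - 1 : i + BAND;
            double s = 0.0;
            for (int j = lo; j <= hi; j++)
                s += cov(i, j) * x[j];
            y[i] = s;
        }
    }

    int main(void)
    {
        double x[N], y[N];
        for (int i = 0; i < N; i++)
            x[i] = 1.0;
        matvec(x, y);
        for (int i = 0; i < N; i++)
            printf("y[%d] = %f\n", i, y[i]);
        return 0;
    }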


STATUS OF BENCHMARK PROGRAMS:

  1. pathfinder: parallelization has been completed. The parallelization strategy is coarse-grained and fully asynchronous: there is no communication between the processors. Unlike the experiments reported by Prasad et al., the whole program is parallelized, including the combination step that they had left out. We have developed a scheme to partition the orbit files such that each processor reads all the data that it needs (a simplified sketch appears after this list). Given the common structure of different sensor data processing programs, we believe that this scheme is applicable to other sensor data processing programs. Given the asynchronous nature of the scheme, it will work as well on shared memory as on distributed memory machines and should scale well as far as computation is concerned. A parallel version of this program runs on the 16-processor IBM SP-2 at the University of Maryland. We are in the process of measuring the performance of the parallelized version with sequential I/O as well as with parallel I/O; we are using the new version of our parallel I/O library for the latter.

  2. climate: we are in the process of parallelizing this program.

  3. gaps: we have optimized this program and it now runs about four times faster than the original version we received. We have also determined a representative set of values for its parameters. (This program has about 70 parameters, many of which control the operations performed and the algorithms used for them.) Currently, we are in the process of parallelizing this program, using the same scheme as for pathfinder. Given the inherent similarity of sensor data processing programs, we believe that this parallelization scheme could be applicable to a wide variety of such programs and could, in fact, be the basis for a parallel version of the Science Data Processing toolkit that is currently being developed by Hughes.

  4. extract: we are in the process of parallelizing this program.

  5. Land Analysis System: we have acquired the source code for this application and have ported it to our environment. This is a huge body of code, and we are in the process of extracting suitable processing scenarios; the new summer intern is working on this. We plan to travel to EDC sometime in July to:
    a. obtain data sets
    b. understand the interaction between the different parts
    c. consult on the representative nature of the various usage scenarios.

  6. Interactive Image Spreadsheet: we have acquired access to the 912 facility and have begun experimenting with the IIS. We are currently designing benchmark programs that implement the representative scenarios mentioned in previous reports. We have identified existing programs from which suitable kernels can be extracted, as well as corresponding datasets. Initially, we considered using the Codevision profiling tool available on SGIs to help determine the I/O requirements; after a small number of experiments, we determined that the information available from Codevision is not adequate for our purposes and have decided to use Pablo instead. Most of this work has been done after hours, as the 912 facility is heavily used.

  7. The SeaWiFS ocean data processing program: we have acquired the source code for this program and are in the process of installing it at UMD. We have also acquired the latest simulated data available.

  8. Physical-Space Statistical Analysis System (PSAS): we are still trying to arrange access to this application. The goal in this case is to determine whether it is possible to achieve a data transfer rate that is faster than recomputation. For example, if an individual computation takes 500 flops and the computational rate is 100 Mflop/s (which is quite respectable for a sustained computation), an I/O rate of 8 megabytes/s is required (assuming double precision values). We have measured about 7 megabytes/s sustained user data rate with a single fast IBM SCSI disk. With multiple disks on a fast wide SCSI controller, 8 megabytes/s appears feasible, especially for large data sets.
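
As a simplified illustration of the pathfinder partitioning mentioned in item 1 above, the sketch below block-partitions the scans of an orbit file across processors so that each processor can read its share independently. The scan and processor counts are made up, and the real scheme partitions the files by what each processor's share of the output requires; only the communication-free structure is the point here.

    #include <stdio.h>

    /* Assign processor `rank` of `nprocs` a contiguous block of scans;
       each processor then reads only its own block, so no communication
       between processors is needed. */
    static void my_scan_range(long nscans, int rank, int nprocs,
                              long *first, long *count)
    {
        long base = nscans / nprocs;
        long rem = nscans % nprocs;
        *first = rank * base + (rank < rem ? rank : rem);
        *count = base + (rank < rem ? 1 : 0);
    }

    int main(void)
    {
        long first, count;
        for (int rank = 0; rank < 4; rank++) {    /* 4 processors */
            my_scan_range(13000, rank, 4, &first, &count);
            printf("processor %d reads scans %ld..%ld\n",
                   rank, first, first + count - 1);
        }
        return 0;
    }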


PARALLEL I/O LIBRARY:

Analysis of the programs studied so far indicates that the I/O requests in these programs are coarse-grained and regular. Descriptions of other sensor-data processing programs, including SeaWiFS, SeaWinds and LAS, indicate similar I/O request patterns. Given the common structure of ECS programs, we speculate that most, if not all, such programs are likely to have similar I/O patterns, which implies that two-phase I/O is unlikely to be beneficial for these programs. We have designed a version of our parallel I/O library that is suitable for handling such I/O patterns. The interface for this library is an extension of the POSIX lio_listio interface (a sketch of the base interface follows below). We have implemented this version on the 16-processor IBM SP-2 at the University of Maryland. To provide efficient access to the high performance switch, the library has been multithreaded so as to collocate the client and the server within the application processes. We are currently evaluating this implementation on synthetic and real workloads.
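
For reference, here is a minimal sketch of the standard POSIX lio_listio interface that our library extends; the file name, request count and request size are illustrative, and our extensions themselves are not shown. On most systems this compiles against the realtime library (e.g. cc sketch.c -lrt).

    #include <aio.h>
    #include <fcntl.h>
    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    #define NREQ  4
    #define CHUNK (1 << 20)   /* 1 MB requests: coarse-grained and regular */

    int main(void)
    {
        int fd = open("sensor.dat", O_RDONLY);   /* illustrative name */
        if (fd < 0) { perror("open"); return 1; }

        struct aiocb cb[NREQ];
        struct aiocb *list[NREQ];
        memset(cb, 0, sizeof cb);
        for (int i = 0; i < NREQ; i++) {
            cb[i].aio_fildes = fd;
            cb[i].aio_buf = malloc(CHUNK);
            cb[i].aio_nbytes = CHUNK;
            cb[i].aio_offset = (off_t)i * CHUNK;  /* regular offsets */
            cb[i].aio_lio_opcode = LIO_READ;
            cb[i].aio_sigevent.sigev_notify = SIGEV_NONE;
            list[i] = &cb[i];
        }

        /* Submit the whole list in one call; LIO_WAIT blocks until all
           of the requests have completed */
        if (lio_listio(LIO_WAIT, list, NREQ, NULL) != 0) {
            perror("lio_listio");
            return 1;
        }
        for (int i = 0; i < NREQ; i++)
            printf("request %d transferred %zd bytes\n",
                   i, aio_return(&cb[i]));
        close(fd);
        return 0;
    }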

