Titan

Remotely sensed data acquired by satellite-based sensors is widely used in geographical, meteorological and environmental studies. A typical analysis processes between ten days and a year of satellite data and generates one or more raster images of the area under study. The output images are usually much smaller than the input data. For example, a 10-day full-globe analysis over coarse-grained satellite data (4 km per pixel) processes approximately 4 GB of data to generate a 228 MB multi-band image. This data reduction is achieved by compositing the information that corresponds to different days.
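To make the idea of compositing concrete, here is a minimal sketch in Python. It assumes a per-pixel maximum-value rule, which is only one possible compositing criterion and is not necessarily the one Titan applies. The name `composite_max` and the tiny 2x2 grids are hypothetical and exist only for illustration.

```python
from typing import List, Optional

# One band for one day; None marks a pixel with no usable reading that day.
Grid = List[List[Optional[float]]]


def composite_max(daily_grids: List[Grid]) -> Grid:
    """Combine daily grids into one by keeping the largest value seen at each pixel.

    Assumption: all grids have the same shape. The max-value rule is illustrative.
    """
    if not daily_grids:
        raise ValueError("need at least one daily grid")
    rows, cols = len(daily_grids[0]), len(daily_grids[0][0])
    out: Grid = [[None] * cols for _ in range(rows)]
    for grid in daily_grids:
        for r in range(rows):
            for c in range(cols):
                v = grid[r][c]
                # Keep v if the pixel is still empty or v beats the current value.
                if v is not None and (out[r][c] is None or v > out[r][c]):
                    out[r][c] = v
    return out


# Three days of input collapse into a single output grid.
days = [
    [[0.1, None], [0.3, 0.2]],
    [[0.4, 0.5], [None, 0.1]],
    [[0.2, 0.3], [0.6, None]],
]
result = composite_max(days)
print(result)  # [[0.4, 0.5], [0.6, 0.2]]
```

However many days go in, the output is a single grid per band, which is why the result is so much smaller than the input.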

Several database systems have been designed to handle the output raster images generated by analyzing raw satellite data, and they provide powerful query operations, including various forms of spatial joins. However, they are not suitable for managing and processing the raw satellite data itself. Titan is a parallel shared-nothing database designed to support management and efficient processing of remote-sensing data. It uses data declustering and placement techniques to exploit the full I/O bandwidth provided by a suitably configured disk farm. The system provides low-latency retrieval of very large volumes of spatio-temporal data from secondary storage. A simplified R-tree is used to efficiently identify the subset of data that corresponds to the region and time periods of interest. Furthermore, Titan integrates data processing with retrieval, so that processing can be performed efficiently on the same machines that store the satellite data. As a result, only the output images are communicated to the clients, and not the input data, which is often much larger. Titan coordinates the processing and retrieval operations so that the retrieved satellite data is processed in a pipelined fashion, and I/O, communication and computation can be fully overlapped to reduce latency.
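The interplay between declustering and index lookup can be sketched as follows. This is an illustration under stated assumptions, not Titan's implementation: the `Chunk` layout, the 90-degree tiling, and the `(i + j + day) % num_disks` placement rule are all hypothetical, and a linear scan over bounding boxes stands in for the simplified R-tree.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple

Interval = Tuple[float, float]  # closed interval (min, max)


@dataclass(frozen=True)
class Chunk:
    chunk_id: int
    lon: Interval   # longitude range covered by the chunk
    lat: Interval   # latitude range covered by the chunk
    day: Interval   # first and last day covered by the chunk
    disk: int       # disk that stores the chunk


def overlaps(a: Interval, b: Interval) -> bool:
    """True when two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def build_chunks(num_disks: int, num_days: int = 10) -> List[Chunk]:
    """Tile the globe into 90-degree chunks per day and spread them over the disks."""
    chunks: List[Chunk] = []
    cid = 0
    for day in range(num_days):
        for i, lon0 in enumerate(range(-180, 180, 90)):
            for j, lat0 in enumerate(range(-90, 90, 90)):
                # Hypothetical placement: chunks adjacent in space or time
                # land on different disks, so a range query touches many disks.
                disk = (i + j + day) % num_disks
                chunks.append(Chunk(cid, (lon0, lon0 + 90), (lat0, lat0 + 90),
                                    (day, day), disk))
                cid += 1
    return chunks


def query(chunks: List[Chunk], lon: Interval, lat: Interval,
          day: Interval) -> Dict[int, List[int]]:
    """Return the ids of the chunks that intersect the query box, grouped by disk."""
    per_disk: Dict[int, List[int]] = defaultdict(list)
    for ch in chunks:  # linear scan standing in for an R-tree search
        if (overlaps(ch.lon, lon) and overlaps(ch.lat, lat)
                and overlaps(ch.day, day)):
            per_disk[ch.disk].append(ch.chunk_id)
    return dict(per_disk)


chunks = build_chunks(num_disks=4)
hits = query(chunks, lon=(-170, -100), lat=(10, 80), day=(0, 3))
print(hits)  # {1: [1], 2: [9], 3: [17], 0: [25]}
```

In this toy layout the four matching chunks fall on four different disks, so all four could be read in parallel. That is the effect a declustering scheme aims for.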

Titan is currently operational on the Maryland SP-2, and contains about 24 GB of data from the Advanced Very High Resolution Radiometer (AVHRR) on the NOAA-7 satellite. Experimental results have shown that Titan provides good performance for global queries and interactive response times for local queries. We are currently adapting Titan into T2, our generalized infrastructure for building customized parallel database systems, which integrates storage, retrieval and processing of multi-dimensional datasets to support a wide range of data-intensive applications.

Related Information:
  • Publication List
  • Active Data Repository Accelerates Access to Large Data Sets, NPACI enVision, Volume 14, Number 2, April-June 1998

Last Updated:  04/20/99