You are granted permission for the non-commercial reproduction, distribution, display, and performance of this technical report in any format. However, this permission is only for a period of 45 (forty-five) days from the most recent time that you verified that this technical report is still available from the Department of Computer Science of the University of Maryland at College Park under terms that include this permission. All other rights are reserved by the author(s).
The Virtual Microscope. Renato Ferreira. Bongki Moon. Jim Humphries. Alan Sussman. Joel Saltz. Robert Miller. Angelo Demarzo. April 1997.
We present the design of the Virtual Microscope, a software system employing a client/server architecture to provide a realistic emulation of a high power light microscope. We discuss several technical challenges related to providing the performance necessary to achieve rapid response time, mainly in dealing with the enormous amounts of data (tens to hundreds of gigabytes per slide) that must be retrieved from secondary storage and processed. To effectively implement the data server, the system design relies on the computational power and high I/O throughput available from an appropriately configured parallel computer. (Also cross-referenced as UMIACS-TR-97-35) University of Maryland Institute for Advanced Computer Studies, Dept. of Computer Science, Univ. of Maryland,
Titan A High-Performance Remote-Sensing Database. Chialin Chang. Bongki Moon. Anurag Acharya. Carter Shock. Alan Sussman. Joel Saltz. August 1996.
There are two major challenges for a high-performance remote-sensing database. First, it must provide low-latency retrieval of very large volumes of spatio-temporal data. This requires effective declustering and placement of a multi-dimensional dataset onto a large disk farm. Second, the order of magnitude reduction in data-size due to post-processing makes it imperative, from a performance perspective, that the postprocessing be done on the machine that holds the data. This requires careful coordination of computation and data retrieval. This paper describes the design, implementation and evaluation of {\em Titan}, a parallel shared-nothing database designed for handling remote-sensing data. The computational platform for Titan is a 16-processor IBM SP-2 with four fast disks attached to each processor. Titan is currently operational and contains about 24~GB of data from the Advanced Very High Resolution Radiometer (AVHRR) on the NOAA-7 satellite. The experimental results show that Titan provides good performance for global queries, and interactive response times for local queries. (Also cross-referenced as UMIACS-TR-96-67) University of Maryland Institute for Advanced Computer Studies, Dept. of Computer Science, Univ. of Maryland,
Analysis of the Clustering Properties of Hilbert Space-filling Curve. March 1996.
Bongki Moon. H.V. Jagadish. Christos Faloutsos. Joel Saltz. Several schemes for linear mapping of multidimensional space have been proposed for many applications such as access methods for spatio-temporal databases, image compression and so on. In all these applications, one of the most desired properties from such linear mappings is clustering, which means the locality between objects in the multidimensional space is preserved in the linear space. It is widely believed that Hilbert space-filling curve achieves the best clustering. In this paper we provide closed-form formulas of the number of clusters required by a given query region of an arbitrary shape (e.g., polygons and polyhedra) for Hilbert space-filling curve. Both the asymptotic solution for a general case and the exact solution for a special case generalize the previous work, and they agree with the empirical results that the number of clusters depends on the hyper-surface area of the query region and not on its hyper-volume. We have also shown that Hilbert curve achieves better clustering than z-curve. From the practical point of view, the formulas given in this paper provide a simple measure which can be used to predict the required disk access behaviors and hence the total access time. (Also cross-referenced as UMIACS-TR-96-20) University of Maryland Institute for Advanced Computer Studies, Dept. of Computer Science, Univ. of Maryland,
Scalability Analysis of Declustering Methods for Cartesian Product. Bongki Moon. Joel Saltz. April 1996.
Efficient storage and retrieval of multi-attribute datasets has become one of the essential requirements for many data-intensive applications. The Cartesian product file has been known as an effective multi-attribute file structure for partial-match and best-match queries. Several heuristic methods have been developed to decluster Cartesian product files over multiple disks to obtain high performance for disk accesses. Though the scalability of the declustering methods becomes increasingly important for systems equipped with a large number of disks, no analytic studies have been done so far. In this paper we derive formulas describing the scalability of two popular declustering methods Disk Modulo and Fieldwise Xor for range queries, which are the most common type of queries. These formulas disclose the limited scalability of the declustering methods and are corroborated by extensive simulation experiments. From the practical point of view, the formulas given in this paper provide a simple measure which can be used to predict the response time of a given range query and to guide the selection of a declustering method under various conditions. (Also cross-referenced as UMIACS-TR-96-5) Department of Computer Science, Univ. of Maryland, University of Maryland Institute for Advanced Computer Studies,
Study of Scalable Declustering Algorithms for Parallel Grid Files. Bongki Moon. Anurag Acharya. Joel Saltz. February 1996.
Efficient storage and retrieval of large multidimensional datasets is an important concern for large-scale scientific computations such as long-running time-dependent simulations which periodically generate snapshots of the state. The main challenge for efficiently handling such datasets is to minimize response time for multidimensional range queries. The grid file is one of the well known access methods for multidimensional and spatial data. We investigate effective and scalable declustering techniques for grid files with the primary goal of minimizing response time and the secondary goal of maximizing the fairness of data distribution. The main contributions of this paper are (1) analytic and experimental evaluation of existing index-based declustering techniques and their extensions for grid files, and (2) development of a proximity-based declustering algorithm called {\em minimax} which is experimentally shown to scale and to consistently achieve better response time compared to available algorithms while maintaining perfect disk distribution. (Also cross-referenced as UMIACS-TR-96-4) University of Maryland Institute for Advanced Computer Studies, Dept. of Computer Science, Univ. of Maryland,
A Manual for the CHAOS Runtime Library. Joel Saltz. Ravi Ponnusamy. Shamik D. Sharma. Bongki Moon. Yuan-Shin Hwang. Mustafa Uysal. Raja Das. March 1995.
Procedures are presented that are designed to help users efficiently program irregular problems (e.g. unstructured mesh sweeps, sparse matrix codes, adaptive mesh partial dif- ferential equations solvers) on distributed memory machines. These procedures are also designed for use in compilers for distributed memory multiprocessors. The portable CHAOS pro- cedures are designed to support dynamic data distributions and to automatically generate send and receive messsage by capturing communications patterns at runtime. (Also cross-referenced as UMIACS-TR-95-34) University of Maryland Institute for Advanced Computer Studies, Dept. of Computer Science, Univ. of Maryland,
Bongki Moon. Mustafa Uysal. Joel Saltz. Index Translation Schemes for Adaptive Computations. February 1995.
Current research in parallel programming is focused on closing the gap between globally indexed algorithms and the separate address spaces of processors on distributed memory multicomputers. A set of index translation schemes have been implemented as a part of CHAOS runtime support library, so that the library functions can be used for implementing a global indez space across a collection of separate local index spaces. These schemes include also software-cached translation schemes aimed at adaptive irregular problems as teen as a distributed translation table technique for statically irregular problems. To evaluate and demonstrate the efficiency of the softwDare-cached translation schemes, experiments have been performed with an adaptively irregular loop kernel and a full-fledped 3D DSMC code from NASA Langley on the Intel Paragon and Cray T3D. This paper also discusses and analyzes the operational conditions under which each scheme can produce optimal performance. (Also cross-referenced as UMIACS-TR-95-28) University of Maryland Institute for Advanced Computer Studies, Dept. of Computer Science, Univ. of Maryland,
Bongki Moon. Joel Saltz. Adaptive Runtime Support for Direct Simulation Monte Carlo. February 1995.
In highly adaptive irregular problems such as many Particle-In-Cell (PICJ codes and Dimet Simulation Monte Carlo (DSMCJ codes, data access patterns may vary from time step to time step. This fluctuation may hinder efficient utilization of distributed memory parallel computers because of the resulting overhead for data redistribution and dynamic load balancing. To efficiently parallelize such adaptive irregular problems on distributed memory parallel computers, several issues such as effective methods for domain partitioning and fast data transportation must be addressed. This paper presents efficient runtime support methods for such problems. A simple one-dimensional domain partitioning method is implemented and compared with unstructured mesh partitioners such as recursive coordinate bisection and recursive inertial bisection. A remapping decision policy has been investigated for dynamic load balancing on S-dimensional DSMC codes. Performance results are presented (Also cross-referenced as UMIACS-TR-95-27) University of Maryland Institute for Advanced Computer Studies, Dept. of Computer Science, Univ. of Maryland,
Parallel Monte Carlo Simulation of Three-Dimensional Flow over a Flat. Robert P. Nance. Hassan Fallah-Adl. Richard G. Wilmoth. Bongki Moon. Joel Saltz. February 1995.
This paper describes a parallel implementation of the direct simulation Monte Carlo method. Runtime library support is used for scheduling and execution of communication between nodes, and domain decomposition is performed dynamically to maintain a favorable load balance. Performance tests are conducted using the code to evaluate various remapping and remappinginterval policies, and it is shown that a one-dimensional chain-partitioning method works best for the problems considered. The parallel code is then used to simulate the Mach 20 nitrogen JYow over a finite-thickness fiat plate. It will be shown that the parallel algorithm produces results which are very similar to previous DSMC results, despite the increased resolution available. However, it yields significantly faster execution times than the scalar code, as well as very good load-balance and scalability characteristics. (Also cross-referenced as UMIACS-TR-95-25) University of Maryland Institute for Advanced Computer Studies, Dept. of Computer Science, Univ. of Maryland,
Run-time and Compile-time Support for Adaptive Irregular Problems. Shamik D. Sharma. Ravi Ponnusamy. Bongki Moon. Yuan-Shin Hwang. Raja Das. Joel Saltz. May 1994.
In adaptive irregular problems the data arrays are accessed via indirection arrays, and data access patterns change during computation. Implementing such problems on distributed memory machines requires support for dynamic data partitioning, efficient preprocessing and fast data migration. This research presents efficient runtime primitives for such problems. This new set of primitives is part of the CHAOS library. It subsumes the previous PARTI library which targeted only static irregular problems. To demonstrate the efficacy of the runtime support, two real adaptive irregular applications have been parallelized using CHAOS primitives: a molecular dynamics code (CHARMM) and a particle-in-cell code (DSMC). The paper also proposes extensions to Fortran D which can allow compilers to generate more efficient code for adaptive problems. These language extensions have been implemented in the Syracuse Fortran 90D/HPF prototype compiler. The performance of the compiler parallelized codes is compared with the hand parallelized versions. (Also cross-referenced as UMIACS-TR-94-55) University of Maryland Institute for Advanced Computer Studies, Dept. of Computer Science, Univ. of Maryland,
Last Generated Fri Aug 11 04:01:01 EDT 2000