<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="http://www.cs.umd.edu/projects/hpsl/chaos/ref/project.xsl"?>
<project>
<title>Generic Multidimensional Scientific Indexing Library</title>
<participants>
<name>Alan Sussman, Ph.D.</name>
<name>Beomseok Nam, M.S.</name>
</participants>
<distribution href="dist/">GMIT trial version (includes GMIL)</distribution>
<references topic="25">
<refitem href="html/index.html">
<title>Programmer's Reference</title>
</refitem>
<refitem href="manual.html">
<title>GMIT Step-by-Step Manual</title>
</refitem>
</references>
<description>
<p>Applications that query very large multidimensional datasets are becoming more common. Many self-describing scientific data file formats have also emerged; these carry structural metadata that helps navigate the multidimensional arrays stored in the files. The files may also contain application-specific semantic metadata. Our Generic Multidimensional Scientific Indexing Library (GMIL) enables us to perform efficient searches for subsets of multidimensional datasets by using semantic information to build multidimensional indexes and by grouping items into properly sized chunks to maximize disk I/O bandwidth. More information about data chunking can be found in the paper <a href="ftp://ftp.cs.umd.edu/pub/hpsl/papers/papers-pdf/ccgrid2003-tech.pdf">Improving Access to Multidimensional Self-describing Scientific Datasets</a>.</p>
<p>The most common retrieval pattern for multidimensional datasets is the spatial range query, which reads a contiguous subset of one or more multidimensional arrays within a given query range. Using spatial indexing techniques such as R*-trees or SH-trees, our indexing library allows direct access to subsets of a dataset.</p>
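<p>The idea behind a spatial range query can be illustrated with a minimal sketch. It assumes each chunk is summarized by an axis-aligned bounding box; the names <code>Chunk</code>, <code>overlaps</code>, and <code>range_query</code> are hypothetical and are not part of the GMIL API. The linear scan below only stands in for the tree search that an R*-tree or SH-tree would perform.</p>
<pre>
```python
from typing import NamedTuple


class Chunk(NamedTuple):
    """Hypothetical index entry: a bounding box plus the chunk's file location."""
    lo: tuple    # minimum coordinate in each dimension
    hi: tuple    # maximum coordinate in each dimension
    offset: int  # byte offset of the chunk (illustrative value only)


def overlaps(chunk, q_lo, q_hi):
    # Two boxes intersect when their intervals overlap in every dimension.
    return all(c_hi >= a and b >= c_lo
               for c_lo, c_hi, a, b in zip(chunk.lo, chunk.hi, q_lo, q_hi))


def range_query(chunks, q_lo, q_hi):
    # Linear scan for clarity; a spatial index prunes most of these tests.
    return [c for c in chunks if overlaps(c, q_lo, q_hi)]


# Three made-up 2-D chunks, for example in (longitude, latitude) space.
chunks = [
    Chunk((0.0, 0.0), (10.0, 10.0), 0),
    Chunk((10.0, 0.0), (20.0, 10.0), 4096),
    Chunk((0.0, 10.0), (10.0, 20.0), 8192),
]

hits = range_query(chunks, (12.0, 2.0), (18.0, 8.0))
print([c.offset for c in hits])
```
</pre>
<p>In this sketch only the second chunk intersects the query box, so only that chunk would need to be read from disk.</p>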
<p align="center"><img src="hdfeos.jpg" height="249"/><br/>Figure 1. HDF-EOS AVHRR Level 1B Dataset Representation</p>
<p>GMIL currently supports a few self-describing scientific data formats that contain structural metadata, such as netCDF, HDF4, and HDF5. Additional scientific data formats, such as the SILO and Chombo mesh data formats, will be supported in future releases. To facilitate range queries, GMIL provides an index creation module, an index search module, a resolution interpolation module, and a filtering module. Any new self-describing scientific data format can use GMIL by adding a small amount of code on top of it, as long as GMIL supports the format's indexing semantics. At the moment, GMIL supports AVHRR Level 1B style indexing semantics, such as those of HDF-EOS. In future releases, we plan to add as many indexing semantics as we can.</p>
<p align="center"><a href="gmit1.jpg"><img border="0" src="gmit1.jpg" height="219"/></a> <a href="gmit2.jpg"><img border="0" src="gmit2.jpg" height="219"/></a><br/>GMIT Version 1.0</p>
<p>The Generic Multidimensional Indexing Tool (GMIT) is a GUI tool for GMIL.</p>
</description>
</project>
