<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="http://www.cs.umd.edu/projects/hpsl/chaos/ref/project.xsl"?>
<project>
<title>Multiple Query Optimization</title>
<participants>
<name>Henrique Andrade, Ph.D.</name>
<name>Joel Saltz, M.D., Ph.D.</name>
<name>Alan Sussman, Ph.D.</name>
<name>Tahsin Kurc, Ph.D.</name>
</participants>
<references topic="23">
</references>
<description>
<p>Data analysis is increasingly employed in collaborative environments, where multiple clients access the same datasets and perform similar processing on the data. For instance, in medical imaging, a large group of students may want to explore the same set of digitized microscopy slides simultaneously, or to visualize the same Magnetic Resonance Imaging (MRI) and Computerized Tomography (CT) results. In these situations, a data server needs to execute multiple queries simultaneously to minimize the latency seen by the clients.</p>
<p>Multiple query optimization (MQO) can be defined as the set of techniques aimed at minimizing the total cost of processing a set of queries by creating an optimized access plan for the set as a whole.</p>
<p>In this line of work, we investigate the problem of optimizing multiple data analysis queries for computation-intensive and data-intensive applications. Many different aspects of the multiple query optimization problem have been studied in other contexts, particularly in relational databases. However, in the context of scientific data analysis applications, the scale of the datasets, the application-specific nature of the data structures, and the computation of user-defined aggregates require new optimization techniques to ensure good system performance, especially under heavy query workloads.</p>
<p>To study these MQO issues, we have designed and implemented a full-fledged, extensible data server, and currently have three real-world applications built on top of it: a telepathology visualization application (<a href="http://www.cs.umd.edu/projects/hpsl/chaos/ResearchAreas/vm/">Virtual Microscope</a>), a volumetric reconstruction application (which builds a 3D view of an object from multiple continuous camera shots), and a remote sensing application (which evaluates queries over <a href="http://edc.usgs.gov/glis/hyper/guide/avhrr">Advanced Very High Resolution Radiometer (AVHRR)</a> datasets). The data server comprises around 80,000 lines of code and runs both on SMP machines and on clusters of PCs. There is also a version customized for widely distributed grid environments.</p>
</description>
</project>
