Overview

A tremendous amount of data is generated every day and must be stored so that it can be easily accessed, whether by people looking up information on a topic, streaming movies, or using data for scientific research. Advances in almost every field are now possible because of the availability of data for research, be it weather data, DNA sequences, sales data for inventory management and marketing, or data collected by satellites.

Most of this data is stored on disk arrays housed in centralized data centers. These data centers pack together racks upon racks of equipment and are typically located where power is cheap, since they consume a tremendous amount of energy. In this project we focus on storage-centric data centers containing thousands of distributed disks that store large quantities of data. As in conventional data centers, thermal management in such large-scale storage systems is a major problem: heavy workloads cause disks to heat up, reducing reliability and significantly increasing cooling cost. We are investigating techniques for thermal management in large-scale storage systems comprising thousands of storage devices and processing millions of data requests per day. Although a significant body of work exists on thermal and energy management in data centers, temperature considerations in large-scale storage systems have received little attention.

We focus on thermal and energy management questions by considering both how workload affects cooling costs and how temperature rise affects reliability. The main objective is to develop models and algorithms that optimize workload distribution to minimize energy usage, with particular attention to managing temperature spikes. In addition, we are interested in techniques that allow a significant fraction of the system to be shut down or slowed down during periods of lower-than-peak demand for data.
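As a rough illustration of what adjusting workload distribution could look like, the sketch below routes each read request to the coolest disk holding a copy of the requested block, skipping disks above a temperature threshold whenever a cooler copy exists. This is only a minimal sketch under assumed inputs: the Disk class, the threshold value, and the replica map are hypothetical placeholders rather than part of the project's actual algorithms, and real temperatures would come from sources such as drive SMART sensors.

```python
import random

# Hypothetical disk state: in a real system, temperature would come from
# SMART readings and the replica map from the storage system's metadata.
class Disk:
    def __init__(self, disk_id, temp_c):
        self.disk_id = disk_id
        self.temp_c = temp_c          # current temperature (Celsius)
        self.queued_requests = 0      # outstanding work on this disk

THROTTLE_TEMP_C = 50.0  # assumed per-disk temperature threshold

def dispatch(request_block, replica_map, disks):
    """Send a read for `request_block` to the coolest disk holding a copy.

    Disks above the threshold are skipped unless no cooler copy exists,
    which approximates shifting workload away from temperature spikes.
    """
    candidates = [disks[d] for d in replica_map[request_block]]
    cool = [d for d in candidates if d.temp_c < THROTTLE_TEMP_C]
    target = min(cool or candidates, key=lambda d: (d.temp_c, d.queued_requests))
    target.queued_requests += 1
    return target.disk_id

if __name__ == "__main__":
    disks = {i: Disk(i, random.uniform(35, 55)) for i in range(6)}
    replica_map = {"block-A": [0, 2, 4], "block-B": [1, 3, 5]}
    for block in ("block-A", "block-B", "block-A"):
        print(block, "->", dispatch(block, replica_map, disks))
```

A real dispatcher would also account for queue lengths, seek locality, and the cooling cost model; the point here is only the shape of a temperature-aware placement decision.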

Applications

Disk access and usage patterns vary considerably across applications. Our goal is to develop a general science of thermal management that guides data placement, data migration, data replication, distribution of data access tasks, scheduling, and control of disk speeds for any large array of storage devices. Because workloads fluctuate, there is tremendous potential to save energy and reduce thermal hotspots by shutting down or slowing down a significant fraction of the disk system (a minimal spin-down sketch appears below). In addition, balancing workloads may allow us to eliminate hot spots. The following applications serve as a guide while developing such a thermal management scheme.
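The sketch below illustrates the shut-down idea in its simplest form: when predicted demand falls well below peak, spin down the least-loaded disks while keeping a minimum fraction of the system active. The disk names, demand predictor, and thresholds are hypothetical, and a real policy would also have to ensure that data on a spun-down disk remains reachable through replicas or after a spin-up delay.

```python
# A minimal sketch, under assumed inputs, of shutting down part of the
# system during low demand. Names and thresholds are placeholders.

PEAK_REQUESTS_PER_SEC = 100_000   # assumed system-wide peak demand
MIN_ACTIVE_FRACTION = 0.3         # never spin down below this fraction of disks

def choose_disks_to_spin_down(predicted_demand, disks_by_recent_load):
    """Return the disks to spin down for the next control interval.

    `disks_by_recent_load` maps disk id -> recent requests/sec; the least
    loaded disks are spun down first, keeping enough disks active for the
    predicted demand (and at least the minimum active fraction).
    """
    total = len(disks_by_recent_load)
    demand_fraction = predicted_demand / PEAK_REQUESTS_PER_SEC
    active_needed = max(int(total * demand_fraction) + 1,
                        int(total * MIN_ACTIVE_FRACTION))
    # Keep the most loaded disks active; spin down the rest.
    ranked = sorted(disks_by_recent_load, key=disks_by_recent_load.get, reverse=True)
    return ranked[active_needed:]

if __name__ == "__main__":
    recent_load = {f"disk-{i}": load for i, load in
                   enumerate([900, 20, 5, 450, 0, 15, 700, 3])}
    print(choose_disks_to_spin_down(predicted_demand=25_000,
                                    disks_by_recent_load=recent_load))
```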

Streaming Video

In recent years, companies such as Netflix have made streaming video a reality, and demand is expected to grow many-fold over the next few years. The typical data access pattern here consists of very large sequential reads, each of which may occupy a single disk for an extended period (perhaps a few hours).

Large-scale Data Analysis

We consider two fairly similar application domains here: (1) very large databases (e.g., scientific federations such as SkyQuery, genome databases), where the typical access pattern is a set of user queries, each of which may access a set of relations; and (2) large-scale analytics, made popular in recent years by the map-reduce framework, where the access pattern is similar (each user query accesses a set of files), with the major exception that replication is typically built into the framework for performance and fault tolerance.
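Because replication is already built into map-reduce-style systems, one way to exploit it for thermal management is to spread a block's copies across distinct cooling zones and prefer cooler disks within each zone, so that a cool copy is usually available to read. The sketch below shows this idea; the notion of a "zone" (standing in for a rack or airflow region), the temperature inputs, and the replica count are assumptions of the sketch, not a description of any existing placement algorithm.

```python
def place_replicas(disks, num_replicas=3):
    """Pick `num_replicas` disks for a new block, favoring cool disks
    spread across distinct (hypothetical) cooling zones.

    `disks` is a list of (disk_id, zone, temp_c) tuples.
    """
    by_zone = {}
    for disk_id, zone, temp_c in disks:
        by_zone.setdefault(zone, []).append((temp_c, disk_id))
    # Coolest disk in each zone, then the coolest zones first.
    coolest_per_zone = sorted(min(members) for members in by_zone.values())
    chosen = [disk_id for _, disk_id in coolest_per_zone[:num_replicas]]
    if len(chosen) < num_replicas:
        # Not enough zones: fall back to the globally coolest remaining disks.
        remaining = sorted((t, d) for d, z, t in disks if d not in chosen)
        chosen += [d for _, d in remaining[:num_replicas - len(chosen)]]
    return chosen

if __name__ == "__main__":
    disks = [("d0", "zone-A", 41.0), ("d1", "zone-A", 38.5),
             ("d2", "zone-B", 45.2), ("d3", "zone-B", 39.9),
             ("d4", "zone-C", 36.7)]
    print(place_replicas(disks))  # expected: ['d4', 'd1', 'd3']
```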

Transaction Processing

Here we may have a large database that must support hundreds of thousands of transactions per second. The key difference from the two application domains above is that data accesses are for small "records" and are typically random. This also captures the behavior of many online Web services and websites such as del.icio.us and Facebook. These applications span a range of access patterns, and we will use them to make our optimization problems concrete.

Publications

Acknowledgement

This material is based upon work supported in part by National Science Foundation grants CCF-0728839 and 0937865.

Who are we?

Faculty

Graduate Students