
Tutorials & Workshops - ACM SIGMETRICS 2000

Storage Systems Management

Authors
Guillermo Alvarez, Kim Keeton, Arif Merchant, Erik Riedel, John Wilkes

Storage Systems Program
Computer Systems and Technology Laboratory
Hewlett-Packard Laboratories
 

Abstract
Storage systems are becoming larger, more complex, and more difficult to manage each year. An enterprise-scale computer installation can contain dozens of hosts and tens or even hundreds of disk arrays, connected via storage area network (SAN) fabrics and easily encompassing thousands of disks and logical volumes. Total capacities of tens of terabytes are becoming commonplace. Even smaller installations may include storage devices from multiple vendors and with different performance levels that must be managed together. In addition, storage systems must provide guaranteed minimum performance and dependability, even in the presence of device failures, upgrades and changes in application requirements.

Designing and maintaining large storage systems to meet such requirements is difficult because of the enormous number of configuration choices available in storage hardware and the complexity and variety of application workloads. Current approaches for managing this complexity are more art than science, requiring human intervention and depending on decisions based on experience, intuition and guesswork. This frequently leads to solutions that are grossly over-provisioned or substantially under-performing or both.

This tutorial surveys the issues in storage management, the available tools for tackling them today and the emerging research on how to solve them as complexity continues to increase. We will briefly describe current storage technologies, including disks, disk arrays, storage area networks (SANs) and network-attached storage (NAS). We will outline the variety of available storage management tools and provide an introduction to performance, availability and workload characterization as basic issues. Finally, we will present our proposal for an attribute-based, goal-directed, self-managing storage system that handles much of the complexity of system management automatically and transparently.
 

Who should attend?
The tutorial is directed towards an audience with a general computer systems background; no specialized storage systems knowledge is assumed, although a basic operating systems background will be useful. Although this work draws on a number of statistical, performance modeling and measurement techniques, we will not delve into their technical details.
 
Biographies
Guillermo Alvarez is Project Scientist in the Storage Systems Program at Hewlett-Packard Laboratories in Palo Alto, California. He joined HP Labs in 1998, after receiving his MS and PhD in Computer Science from the University of California, San Diego. He has co-authored more than 20 papers in the areas of RAIDs, fault-tolerant distributed systems, and parallel computing. He has been named Fellow of the Organization of American States. His current research interests include the design and performability evaluation of high-performance storage architectures.

Kimberly Keeton is a Researcher in the Storage Systems Program at Hewlett-Packard Laboratories. Her current research interests include the design and evaluation of large-scale storage systems and workload characterization and modeling. She has also worked in the areas of intelligent storage devices, processor and memory system evaluation, databases and network protocol evaluation. Before joining HP Labs in 1999, she received her PhD and MS in Computer Science from the University of California, Berkeley, and her BS in Computer Engineering and Engineering and Public Policy from Carnegie Mellon University.

Arif Merchant is Project Scientist in the Storage Systems Program at Hewlett-Packard Laboratories. His research interests include the design and modeling of storage systems. He received a B. Tech. from IIT, Bombay and a PhD in Computer Science from Stanford University. He worked at the IBM Watson Research Center in Hawthorne, NY, and the NEC Computer and Communications Research Laboratories in Princeton, NJ, before joining HP Labs in 1995.

Erik Riedel is a Researcher in the Storage Systems Program at Hewlett-Packard Laboratories. His research interests include distributed storage systems, workload characterization, and the design of new protocols for accessing storage. He joined HP Labs in 1999, after receiving a BS in Computer Science, an MSE in Software Engineering, and a PhD in Computer Engineering from Carnegie Mellon University. His previous work includes parallel file systems, network-attached storage, databases, and Active Disks that map application processing directly to storage devices.

John Wilkes is Director of the Storage Systems Program at Hewlett-Packard Laboratories. His main research interest is in the design and management of fast, highly available, distributed-storage systems. He has also dabbled in network architectures (the Hamlyn sender-based message model), OS design (most recently in the Brevix project), and in learning about Gothic and early Renaissance art and architecture. He earned a BA and MA in physics and a Diploma and PhD in computer science from the University of Cambridge. He has been at Hewlett-Packard Labs since 1982, where he is now a Laboratory Scientist.
 


[Last updated Fri Apr 14 2000]
