Storage systems are becoming larger, more complex, and more difficult
to manage each year. An enterprise-scale computer installation can
contain dozens of hosts and tens or even hundreds of disk arrays,
connected via storage area network (SAN) fabrics and easily
encompassing thousands of disks and logical volumes. Total capacities
of tens of terabytes are becoming commonplace. Even smaller
installations may include storage devices from multiple vendors and
with different performance levels that must be managed together. In
addition, storage systems must provide guaranteed minimum performance
and dependability, even in the presence of device failures, upgrades
and changes in application requirements.
Designing and maintaining large storage systems to meet such
requirements is difficult because of the enormous number of
configuration choices available in storage hardware and the complexity
and variety of application workloads. Current approaches for managing
this complexity are more art than science, requiring human
intervention and depending on decisions based on experience, intuition
and guesswork. The result is frequently a solution that is grossly
over-provisioned, substantially under-performing, or both.
This tutorial surveys the issues in storage management, the available
tools for tackling them today and the emerging research on how to
solve them as complexity continues to increase. We will briefly
describe current storage technologies, including disks, disk arrays,
storage area networks (SANs) and network-attached storage (NAS). We
will outline the variety of available storage management tools and
provide an introduction to performance, availability and workload
characterization as basic issues. Finally, we will present our
proposal for an attribute-based, goal-directed, self-managing storage
system that handles much of the complexity of system management
automatically and transparently.
Guillermo Alvarez is a Project Scientist in the Storage Systems Program
at Hewlett-Packard Laboratories in Palo Alto, California. He joined
HP Labs in 1998, after receiving his MS and PhD in Computer Science
from the University of California, San Diego. He has co-authored
more than 20 papers in the areas of RAIDs, fault-tolerant distributed
systems, and parallel computing. He has been named a Fellow of the
Organization of American States. His current research interests
include the design and performability evaluation of high-performance
storage architectures.
Kimberly Keeton is a Researcher in the Storage Systems Program at
Hewlett-Packard Laboratories. Her current research interests include
the design and evaluation of large-scale storage systems and workload
characterization and modeling. She has also worked in the areas of
intelligent storage devices, processor and memory system evaluation,
databases and network protocol evaluation. Before joining HP
Labs in 1999, she received her PhD and MS in Computer Science from the
University of California, Berkeley, and her BS in Computer
Engineering and Engineering and Public Policy from Carnegie Mellon
University.
Arif Merchant is a Project Scientist in the Storage Systems Program at
Hewlett-Packard Laboratories. His research interests include the
design and modeling of storage systems. He received a B.Tech. from
IIT Bombay and a PhD in Computer Science from Stanford University.
He worked at the IBM Watson Research Center in Hawthorne, NY, and
the NEC Computer and Communications Research Laboratories in
Princeton, NJ, before joining HP Labs in 1995.
Erik Riedel is a Researcher in the Storage Systems Program at
Hewlett-Packard Laboratories. His research interests include
distributed storage systems, workload characterization, and the design
of new protocols for accessing storage. He joined HP Labs in 1999,
after receiving a BS in Computer Science, an MSE in Software
Engineering, and a PhD in Computer Engineering from Carnegie Mellon
University. His previous work includes parallel file systems,
network-attached storage, databases, and Active Disks that map
application processing directly to storage devices.
John Wilkes is Director of the Storage Systems Program at
Hewlett-Packard Laboratories. His main research interest is in the
design and management of fast, highly available, distributed storage
systems. He has also dabbled in network architectures (the Hamlyn
sender-based message model), in OS design (most recently in the Brevix
project), and in learning about Gothic and early Renaissance art and
architecture. He earned a BA and MA in physics and a Diploma and PhD
in computer science from the University of Cambridge. He has been at
Hewlett-Packard Labs since 1982, where he is now a Laboratory
Scientist.