spacer
spacer spacer spacer spacer spacer spacer spacer spacer spacer
spacer spacerPublic home pagespacer spacer spacerLocal home pagespacer spacer spacerHow to contact usspacer spacer spacerSearchspacer spacer
spacer spacer spacer spacer spacer spacer spacer spacer spacer
spacer
spacer

Catalog Description

Study of techniques for building traditional, relational database systems. This course focuses on performance and reliability considerations and highlights the interdependencies among the choices facing the system implementor. Topics include: Database Management System Architecture, Disk and Memory Management, Access Paths and Indexes, Concurrency Control, Crash Recovery, Query Execution, Query Optimization, and Benchmarking. A semester-long project involves constructing a small relational database system that incorporates many of the techniques studied.

Objective

A fairly standard set of techniques has been developed for implementing high-performance database systems. These techniques include: B+trees, two-phase locking, write-ahead logging, and dynamic programming-based query optimization. As in any complex system, The basic techniques often interact in subtle ways; the design choices made at one level of the system can place constraints on other decisions that may not be immediately apparent. The purpose of this course is to investigate these traditional techniques and their interactions through the implementation of a scaled-down relational database system and through the study of some classic papers. This course focuses almost exclusively on the implementation of relational database systems in a centralized environment. However, the techniques covered are essential building blocks for distributed, parallel or object-oriented database systems.

Prerequisites

CMSC 424 or permission of instructor.

Topics

  1. Introduction to database systems (1 week)

    Goals and functions of DBMS, transactions, reference architecture for a relational DBMS, etc.

  2. The ``Roots'' (1 week)

    Overviews of some classic systems: System R, INGRES, DB2.

  3. Architectural foundations (2 weeks)

    Characteristics of hardware and operating systems impacting the design of a DBMS.

  4. Buffer management (1 week)

    Memory management of multi-user systems, DBMin algorithm, implications of transaction semantics.

  5. Access paths and indexes (1 week)

    Structures that are optimized for disk-resident data, e.g., B+trees and linear hashing

  6. Concurrency control (2 weeks)

    Locking techniques, lock manager implementation, comparison of pessimistic and optimistic techniques, concurrent access to search structures, deadlock handling.

  7. Logging and crash recovery (2 weeks)

    Write-ahead logging, the ARIES method, shadow-based techniques, media failures.

  8. Query processing (3 weeks)

    Access path selection, join methods, optimization techniques, sub-query and view processing.

  9. Benchmarking and performance (1 week)

    TPC and Wisconsin benchmarks, performance measurement and tuning.

  10. Transaction management (1 week)

    Distributed commit protocols and transaction monitors.

Course Text

M. Stonebraker, Readings on Database Systems, Second Edition, Morgan Kauffman, 1994.

Typical Grading and Workload