CMSC724: Database Management Systems
Prof. Amol Deshpande
CSIC 1121; Mon-Wed 3:30pm-4:45pm
Schedule and Readings
Jan 25: Introduction/Overview
For a nice historical overview, see the first paper ("Evolution of ..") in this issue of ACM Computing Surveys, March 1976.
Jan 30, Feb 1, Feb 6: Background
[slides data models]
[normalization overview slides (spring 2011)]
(Jan 30) "What goes around comes around"; Mike Stonebraker and Joe Hellerstein; Redbook.
(Feb 1) "Architecture of a Database System"; Joe Hellerstein, Mike Stonebraker, James Hamilton; Foundations and Trends 2007. (crop-merged version of that PDF)
(Feb 1) Concurrency Control and Recovery; Mike Franklin, 1997
(Feb 6) Joachim W. Schmidt. Some High Level Language Constructs for Data of Type Relation. ACM Transactions on Database Systems, 2(3), 1977, 247-261.
Goetz Graefe: Query Evaluation Techniques for Large Databases. ACM Comput. Surv. 25(2): 73-170 (1993)
Database System Concepts; Avi Silberschatz, Henry F. Korth, S. Sudarshan. Two appendices covering the network model and the hierarchical model in detail are available on the book webpage.
The Relational vs Network Models Debate
Is your database relational? Ted Codd
Feb 8, Feb 13, Feb 15, Feb 20: Large-scale Data Analysis Systems
Overview Chapter in the Redbook by Bailis
(Feb 8) Parallel database systems: the future of high performance database systems; DeWitt, Gray; CACM 1992
(Feb 8) Stonebraker et al. C-store: A Column-oriented DBMS. SIGMOD, 2005.
(Feb 13) MapReduce: A Flexible Data Processing Tool; Jeffrey Dean and Sanjay Ghemawat; CACM 2010
(Feb 13) MapReduce and Parallel DBMSs: Friends or Foes? Stonebraker et al.; CACM 2010
(Feb 15) Yuan Yu, Michael Isard, Dennis Fetterly, Mihai Budiu. DryadLINQ: A System for General-Purpose Distributed Data-Parallel Computing Using a High-Level Language. OSDI, 2008.
(Feb 20) Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing; Zaharia et al.; NSDI 2012
(Feb 20) Spark SQL: Relational data processing in Spark; SIGMOD 2015.
Feb 22, Feb 27, Mar 1, Mar 6, Mar 8, Mar 13: Query Processing and Query Optimization
Overview Chapter in the Redbook by Hellerstein
Surajit Chaudhuri: An Overview of Query Optimization in Relational Systems. PODS 1998: 34-43
(Feb 22) Access path selection in a relational database management system; Selinger et al.; SIGMOD 1979
(Feb 27) Selectivity estimation without the attribute value independence assumption; Poosala and Ioannidis; VLDB 1997
(Feb 27) CORDS: automatic discovery of correlations and soft functional dependencies; SIGMOD 2004
(Mar 1) Multi-Core, Main-Memory Joins: Sort vs. Hash Revisited; VLDB 2013.
(Mar 6) Column-Stores vs. Row-Stores: How Different Are They Really?; SIGMOD 2008.
(Mar 8) Skew Strikes Back: New Developments in the Theory of Join Algorithms; SIGMOD Record 2013
(Mar 13) Ron Avnur and Joseph M. Hellerstein. Eddies: Continuously Adaptive Query Processing. SIGMOD, 2000.
(Mar 13) Volker Markl, Vijayshankar Raman, David Simmen, Guy Lohman, Hamid Pirahesh, Miso Cilimdzic. Robust Query Processing Through Progressive Optimization. SIGMOD, 2004.
Mar 15: Buffer/TBA
Mar 27, Mar 29, Apr 3, Apr 5: Other DBMS Models/Architectures
(Mar 27) Cristian Diaconu, Craig Freedman, Erik Ismert, Per-Ake Larson, Pravin Mittal, Ryan Stonecipher, Nitin Verma, Mike Zwilling. Hekaton: SQL Server's Memory-optimized OLTP Engine. SIGMOD, 2013.
(Mar 27) Stavros Harizopoulos, Daniel J. Abadi, Samuel Madden, Michael Stonebraker. OLTP Through the Looking Glass, and What We Found There. SIGMOD, 2008.
(Mar 29) RDF-3X: a RISC-style Engine for RDF; VLDB 2008
[link to pdf]
(Apr 3) Relational Databases for Querying XML Documents: Limitations and Opportunities; Jayavel Shanmugasundaram et al.; VLDB 1999
(Apr 5) GraphX: Graph Processing in a Distributed Dataflow Framework; OSDI 2014
(Apr 5) SQLGraph: An Efficient Relational-Based Property Graph Store; SIGMOD 2015
Apr 10, Apr 12, Apr 17: Data Streams/Dataflow Engines
(Apr 10) Continuous queries over data streams; Babu, Widom; SIGMOD Record 2001
[link to pdf]
(Apr 12) Maintenance of materialized views: Problems, techniques, and applications; A. Gupta, I. S. Mumick; IEEE Data Eng. Bull., 1995
(Apr 17) The Design of the Borealis Stream Processing Engine; Abadi et al.; CIDR 2005
(Apr 17) Discretized streams: fault-tolerant streaming computation at scale; SOSP 2013
Apr 24, Apr 26, May 1, May 3, May 8, May 10: Complex and Interactive Analytics
(Apr 24) Implementing data cubes efficiently; Harinarayan et al.; SIGMOD 1996.
(Apr 24) Yihong Zhao, Prasad M. Deshpande, Jeffrey F. Naughton. An Array-Based Algorithm for Simultaneous Multidimensional Aggregates. SIGMOD, 1997.
(Apr 26) Differential Dataflow; CIDR 2013.
(May 1) Interactive data analysis: the Control project; IEEE Computer 1999
(May 3) Dremel: Interactive Analysis of Web-Scale Datasets; VLDB 2010
(May 3) BlinkDB; EuroSys 2013
[ACM DL Link]
(May 8) Towards a unified architecture for in-RDBMS analytics; Feng et al.; SIGMOD 2012
[link to pdf]
(May 10) Distributed GraphLab: a framework for machine learning and data mining in the cloud; VLDB 2012
[link to pdf]
May 18, 10:30am-12:30pm: Scheduled Final Exam for the Class