For each of the required papers, a critique must be submitted by 1:30pm the day of the class. More below.
- Aug 30: Introduction/Overview [pdf slides]
- Sept 1: Probabilistic Databases [pdf slides]
- Sept 8: Probabilistic Databases; Probabilistic Graphical Models
- Sept 13: Probabilistic Databases
- Sept 15: Probabilistic Databases
- Sept 20: No Class
- I strongly recommend attending William Cohen's talk instead -- the topic is closely related to graph analysis and databases.
- Sept 22: Probabilistic Databases [pdf slides]
- Sept 27: Ranking; Information Retrieval [pdf slides]
- Sept 29: No Class; MAKEUP CLASS ON Oct 1, 12:30pm
- Oct 6: Graph Databases
- Oct 11: Graph Databases
- Oct 13: Graph Databases [pdf slides]
- Oct 18: Attend Dan Suciu's Talk
- In 1115 CSIC, at 4pm. You are required to submit a summary after the talk. The talk is largely based on the paper below.
- Oct 20: Graph Databases [pdf slides]
- Oct 25: Graph Databases [pdf slides]
- Nov 1: Graph Databases [pdf slides]
- Nov 3: Graph Databases (slides above)
- Thursday, Nov 4: Attend Val Tannen's Talk
- Nov 8: MapReduce/Parallel Databases [pdf slides]
- (summary required) MapReduce: Simplified Data Processing on Large Clusters; Jeffrey Dean and Sanjay Ghemawat; OSDI'04: Sixth Symposium on Operating System Design and Implementation
- (summary required) MapReduce: A major step backwards; DeWitt and Stonebraker; 2008
[blog post 1]
[blog post 2]
- Nov 10: MapReduce/Parallel Databases
- Nov 15: MapReduce/Graph Analytics
- Nov 17: MapReduce/Graph Analytics
- Nov 22: MapReduce/Graph Analytics
- Nov 29: Graph Analytics
- Dec 1: Graph Analytics
- Dec 6: Project Presentations
- Dec 8: Project Presentations
- Tentative Reading List: Large Scale (Graph) Analytics
Tentative reading list
This is a list of some of the relevant papers for each of the broad topics. We will choose a subset of these for the class.
Probabilistic databases; Scalable probabilistic models and inference
Large-scale Analytics in the Cloud
This is a research-oriented class, so the main work is independently
reading and evaluating research papers in the field of databases. For each of the assigned
papers, you should submit a critique before the class. The critiques should show
evidence of independent thinking, and there are many ways you could structure them.
Here are two suggestions:
- A short summary (4-5 lines), followed by 3 strong points of the paper (things you
liked about it) and 3 weak points of the paper.
- A short summary (4-5 lines), followed by 3 questions about the content of the paper.
I will post examples of some summaries after the first paper (or you can look at some examples
from last year here).
The critique should be posted on the class forum in the thread corresponding to the paper.
The forum is set to be a private forum, so you must join the group "CMSC 724 Spring 2009" before you can post in it.
To join the group, follow the instructions at: Joining a Group.
Critiques are worth about 20% (along with class participation).
Late or missing submissions will be penalized. Missing up to
two summaries is fine. Beyond that: 3 missed or late submissions, 5/20 points deducted; 4 missed submissions, 10/20 deducted;
5 missed submissions, 20/20 deducted.