Principles of Data Mining

Spring 2002

Course Description:
This course covers the fundamentals of data mining from both statistical and database perspectives. The course has three major sections. The first section covers the statistical and machine learning foundations of data mining. The second section covers fundamental data mining concepts and algorithms for tasks such as OLAP, association rules, and clustering. The final section focuses on research areas such as text mining, collaborative filtering, link analysis, and mining in biological domains (as time permits).

Time and Place: Tu, Th 11:00 - 12:15 CLDB0109

Instructor:
Lise Getoor
AVW 3205
office hours: Tu 3:30-5:30PM and by appt.

Teaching Assistant:
Eiman Elnahrawy
AVW 3228
office hours: Tu 12:30-1:30PM, Th 9:30-10:30AM and by appt.


CMSC828G can count toward either the AI or the DB PhD qualifying coursework, but not both. It can be used as a DB qualifying course only if the other course taken is CMSC724.

Prerequisites: CMSC421 (Introduction to Artificial Intelligence) and CMSC424 (Database Design), or equivalent courses.

Workload: There will be an in-class midterm, a final, and four homework assignments. A major component of the workload will be a class project.

Grading: Midterm (20%), Final (30%), Project (35%), HW (15%).

Text: Principles of Data Mining, by David Hand, Heikki Mannila and Padhraic Smyth. MIT Press, 2001.

Mailing list: To subscribe to the mailing list, send a message to with 'subscribe cmsc828g' in the body of your email.




Project Description

Some additional resources
