CMSC454, Algorithms for Data Science, Spring 2022

Instructor: Aravind Srinivasan
Class Venue and Time: IRB 1116, TuTh 11AM - 12:15PM

tl;dr -- Students: please add yourselves to the Piazza page for the class! We will use Piazza extensively.

Administrative Details

Instructor: Aravind Srinivasan
Office: IRB 4164, Phone: 301-405-2695
Instructor's Office Hours: By Zoom Mon, Wed 9-10AM (Zoom link will be shared on Piazza), or by appointment (please email Aravind)
TA: Haoan Feng (hfengac AT umd).
TA Office Hours: TBA.
Course Time and Location: Tue, Thu 11-12:15, IRB 1116
Course Webpage: https://www.cs.umd.edu/class/spring2022/cmsc454.

Textbook, Lecture Notes and Related Resources

There is no required text. We will have a good deal of overlap with (but will not be identical to) Prof. Cameron Musco's class.

We will also use two free online books as references for part of the class:
Foundations of Data Science by Avrim Blum, John Hopcroft, and Ravi Kannan; and
Mining of Massive Datasets by Jure Leskovec, Anand Rajaraman, and Jeff Ullman.

Aravind's Pledge to the Students

Your education is very important to me, and I respect each of you regardless of how you do in the class. My expectations of you are that you attend class and pay full attention, and give enough time to the course. I strongly encourage you to ask questions in class, and to come to the office hours (mine or the TA's) with any further questions. We can have a very enjoyable educational experience if you pay attention in class, give sufficient time to our course, and bring any difficulties you have promptly to our attention. I look forward to our interaction both inside and outside the classroom.

Course Overview

This course presents some fundamental algorithmic approaches in data science. The approach will be mathematical/algorithmic and proof-based, but attention will be drawn regularly to real-world data-science concerns that motivate our problems and approaches.

Topics include: probability review; data verification on the cloud; hashing, concentration bounds, and Bloom filters; sketching and streaming; Nelson-Yu approximate counting; high-dimensional geometry; dimension reduction and the Johnson-Lindenstrauss Lemma; linear-algebra review; PCA; low-rank approximation; SVD; spectral clustering; gradient descent and its relatives; differential privacy; fairness in data science; and related topics.

Grading, Homework, and Exams

There will be a combination of written homeworks (about six), a midterm exam, and a comprehensive final exam.

All homework is to be done individually, but discussions with classmates are encouraged: just list the classmates you discussed the assignment with.

Weight of assignments: homework (45% total), midterm exam (20%), and final exam (35%).

The midterm exam will be 11AM-12:15PM on Thursday, March 17; the final exam will be 8-10AM on Thursday, May 12. Both will be in class.

Unless otherwise noted, the following late policy shall be applied to all homework:

The lowest homework score will be dropped.
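As a rough illustration of how the weights above combine, here is a minimal sketch in Python. It assumes that every score is a percentage between 0 and 100, that the homeworks count equally, and that exactly one lowest homework score is dropped before averaging; the function name course_score is hypothetical, and the instructor's actual computation may differ.

```python
def course_score(homework, midterm, final):
    """Combine percentage scores (0-100) into an overall course percentage.

    Assumptions (not stated in the syllabus): homeworks are weighted
    equally, and the single lowest homework score is dropped.
    """
    if len(homework) < 2:
        raise ValueError("need at least two homework scores to drop one")
    kept = sorted(homework)[1:]       # drop the single lowest homework score
    hw_avg = sum(kept) / len(kept)    # equal-weight average of the rest
    # Weights from the syllabus: homework 45%, midterm 20%, final 35%.
    return 0.45 * hw_avg + 0.20 * midterm + 0.35 * final
```

For example, course_score([100, 80, 90, 70, 100, 60], midterm=80, final=90) drops the 60 and averages the remaining five homework scores before applying the weights.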

Course Evaluation

Students are strongly encouraged to complete their course evaluations; please do so at the CourseEvalUM website when it is ready.

Academic Accommodations for Disabilities

Any student eligible for and requesting reasonable academic accommodations due to a disability should provide the instructor, in class or by email, with a letter of accommodation from the Office of Disability Support Services (DSS) within the first two weeks of the semester.

COVID-19--related issues, Excused Absences, Academic Integrity, and Additional Info.

President Pines provided clear expectations to the University about the wearing of masks for students, faculty, and staff. Face coverings over the nose and mouth are required at all times while you are indoors. KN95 masks are required in all classroom settings and recommended everywhere. There are no exceptions when it comes to classrooms, laboratories, and campus offices. Students not wearing a mask will be given a warning and asked to wear one, or will be asked to leave the room immediately. Students who have additional issues with the mask expectation after a first warning will be referred to the Office of Student Conduct for failure to comply with a directive of University officials.

Please see the Masking mandate at https://umd.edu/virusinfo/emails/080621.

For COVID-19--related Disability Accommodations & Requests for Consideration, see https://umd.edu/virusinfo/emails/063021-2.

The policy on excused absences is at https://www.ugst.umd.edu/V-1.00(G).html.

Please see the university's policies on various important issues.
