Course Overview

Course Description: The future of Artificial Intelligence demands a paradigm shift towards multimodal perception, enabling systems to interpret and integrate information from diverse modalities such as vision, audio, language, touch, and beyond. Multimodal Deep Learning is a dynamic and interdisciplinary field with applications in areas like multimedia analysis, autonomous driving, healthcare, virtual assistants, robotics, and more. This course will provide a comprehensive introduction to the fundamental concepts of key modalities and algorithms for multimodal representation learning, alignment, and fusion. Students will explore a range of multimodal and cross-modal applications while gaining hands-on experience in implementing state-of-the-art multimodal deep learning models. Along the way, we will also examine cutting-edge research and emerging trends in this vibrant and rapidly evolving field.

Logistics

Instructor: Ruohan Gao (rhgao [at] umd.edu)

Office: IRB-4248

Office hours: Thursday 2:00PM - 3:00PM


TA: Zihao Wei (zihaowei [at] umd.edu)

TA Station: AVW-4424

Office hours: Monday 3:00PM - 4:00PM


TA: Derong Jin (djin77 [at] umd.edu)

TA Station: AVW-4424

Office hours: Friday 1:00PM - 2:00PM


  • Lectures: Tuesday/Thursday 12:30PM - 1:45PM Eastern Time at IRB 2107.
  • Piazza: We will be using Piazza as the primary platform for communication.
  • Gradescope: Submit your assignments on Gradescope.
  • Canvas: Submit your project proposal, project milestone, and final report on Canvas.

Course Requirements

Course Prerequisites: Minimum grade of C- in CMSC320, CMSC330, and CMSC351; and 1 course with a minimum grade of C- from (MATH240, MATH341, MATH461). Each student is expected to be proficient in Python programming and familiar with basic linear algebra, probability, and multivariable calculus.

Requirements Summary:
  • Assignments: three assignments which will improve both your theoretical understanding and your practical skills throughout the course. All assignments will contain programming parts and written questions.
  • Midterm Exam: an in-class midterm exam on lecture contents. Details will be announced on Piazza closer to the exam date.
  • Final Project: a research-oriented final project, completed with one or two partners, in which you apply what you have learned in class to a problem of your interest. More details are in the lecture slides from week 1.

Grading Summary:
  • 45% Assignments (three assignments that contain both coding parts and written questions)
  • 20% In-Class Midterm Exam
  • 35% Final Project (including project proposal, project milestone, final report, and presentation)
  • 3% Extra Credit (class participation, exceptional final project, exceptional presentation, etc.)
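
As a rough illustration of how the weights above combine, here is a minimal Python sketch. Only the weights come from the grading summary. The function name `course_total`, the 0-100 scale for each component, and the choice to add extra credit on top of the weighted total (capped at 3 points) are assumptions made for this example, not stated course policy.

```python
# Weights taken from the Grading Summary; everything else is illustrative.
WEIGHTS = {"assignments": 0.45, "midterm": 0.20, "project": 0.35}
MAX_EXTRA_CREDIT = 3.0  # percentage points


def course_total(scores, extra_credit=0.0):
    """Combine component scores (each on a 0-100 scale) into a course percentage.

    Assumption: extra credit is added on top of the weighted total and is
    capped at 3 points; the syllabus does not spell out how it is applied.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    weighted = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    bonus = min(max(extra_credit, 0.0), MAX_EXTRA_CREDIT)
    return weighted + bonus
```

For example, `course_total({"assignments": 90, "midterm": 80, "project": 100})` comes to about 91.5 before any extra credit.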

Important Dates

  • Friday, Sept 19: assignment 1 released.
  • Friday, Oct 3: project proposal due at 11:59pm ET.
  • Monday, Oct 6: assignment 1 due at 11:59pm ET.
  • Friday, Oct 10: assignment 2 released.
  • Thursday, Oct 23: in-class midterm exam.
  • Monday, Nov 3: assignment 2 due at 11:59pm ET.
  • Friday, Nov 7: assignment 3 released.
  • Friday, Nov 21: project milestone due at 11:59pm ET.
  • Monday, Dec 1: assignment 3 due at 11:59pm ET.
  • Friday, Dec 12: final project report due at 11:59pm ET.

Schedule

Policy

Late Policy:
  • All students have 4 free late days (to be used in 24-hour blocks) for the course.
  • You may use up to 2 late days per assignment with no penalty.
  • You may not use late days for the final project report.
  • Once you have exhausted your free late days, we will deduct a late penalty of 25% per additional late day.
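
One possible reading of the late-day arithmetic above, sketched in Python. The helper names are hypothetical, and the sketch makes several assumptions the policy does not state: lateness is rounded up to whole 24-hour blocks, any late day beyond the 2 free days allowed on a single assignment is penalized even if free days remain, the 25% deduction is taken from the earned score, and free days are consumed in submission order. Check with the course staff for the authoritative interpretation.

```python
import math

FREE_LATE_DAYS = 4           # per student, for the whole course
MAX_FREE_PER_ASSIGNMENT = 2  # free late days usable on one assignment
PENALTY_PER_DAY = 0.25       # fraction of the score deducted per penalized day


def late_days(hours_late):
    """Round lateness up to whole 24-hour blocks (assumed rounding rule)."""
    return max(0, math.ceil(hours_late / 24))


def apply_late_policy(submissions):
    """Return adjusted scores for a list of (raw_score, hours_late) pairs.

    Submissions are processed in order, so earlier assignments consume
    free late days first (an assumption of this sketch).
    """
    remaining = FREE_LATE_DAYS
    adjusted = []
    for raw_score, hours_late in submissions:
        days = late_days(hours_late)
        free = min(days, MAX_FREE_PER_ASSIGNMENT, remaining)
        remaining -= free
        penalized = days - free
        # The score never drops below zero, however late the submission is.
        factor = max(0.0, 1.0 - PENALTY_PER_DAY * penalized)
        adjusted.append(raw_score * factor)
    return adjusted
```

Under these assumptions, a student who submits three assignments 30, 50, and 1 hours late uses 2 free days on the first, 2 free days plus 1 penalized day on the second, and 1 penalized day on the third.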

Academic Integrity: Academic dishonesty includes not only cheating, fabrication, and plagiarism, but also helping other students commit acts of academic dishonesty by allowing them to obtain copies of your work. In short, all submitted work must be your own. Cases of academic dishonesty will be pursued to the fullest extent possible as stipulated by the Office of Student Conduct. It is very important for you to be aware of the consequences of cheating, fabrication, facilitation, and plagiarism. For more information on the Code of Academic Integrity or the Student Honor Council, please see the University of Maryland Code of Academic Integrity and the Computer Science Department Academic Integrity Information.

Excused Absences: Any student who needs to be excused for an absence from a single lecture, recitation, or lab due to a medically necessitated absence shall make a reasonable attempt to inform the instructor of their illness prior to the class. Upon returning to class, they should present the instructor with a self-signed note attesting to the date of their illness. Each note must contain an acknowledgment by the student that the information provided is true and correct. Providing false information to University officials is prohibited under Part 9(i) of the Code of Student Conduct (V-1.00(B) University of Maryland Code of Student Conduct) and may result in disciplinary action. For further details, please see the University of Maryland Policy on Excused Absence.

Other Accommodations and Policies:

Acknowledgements: Some course materials are adapted from Stanford's CS231n course and CMU's Multimodal Machine Learning course. Thanks to the course instructors for sharing their slides.