CMSC 828L Deep Learning

 

Staff

 

Professor David Jacobs, 4421 AV Williams

Email: djacobs-at-cs

 

TAs:     Justin Terry  Email: justinkterry-at-gmail

             Chen Zhu  Email: chenzhu-at-cs

 

Office Hours:

                    Monday, 5-6.  Justin.

                    Tuesday, 3-4.  David.

                    Wednesday, 10-11.  David.

                    Wednesday, 5-6. Justin.

                    Thursday, 4-6.  Chen.

                   Location: TA office hours will be in 4101 or 4103 AV Williams, depending on availability

                   (check both rooms).  Prof. Jacobs's office hours will be in 4421 AV Williams.

 

 

Readings

 

The following two books are available online.

Deep Learning, by Ian Goodfellow and Yoshua Bengio and Aaron Courville

Neural Networks and Deep Learning, by Michael Nielsen

 

Other reading material appears in the schedule below.

 

Requirements

 

Students registered for this class must complete the following assignments:

 

Presentation: Students will form groups of three.  Each group will prepare a 30-minute presentation on a topic of its choice.  Each group will select two papers and present a summary and critical analysis of the material in those papers, along with any other appropriate background or related material.  Groups should video record their presentation and submit a link to the video.  Presentations will be graded on choice of topic (is the material interesting?), clarity (do we understand the key points?), focus (does the presentation highlight the most important parts of the work, rather than uniformly summarizing everything?), and analysis (does the presentation help us understand the strengths and limitations of the presented work?).  The six strongest presentations will be selected for live presentation to the full class.

Problem Sets: There will be three problem sets assigned during the course.  These will include programming projects and may also include written exercises.

Midterm: There will be a one-week, take-home midterm.  This will include paper-and-pencil exercises.

Final Exam: There will be an in-class final exam.

 

Course Policies

 

 

 

 

 

Course work, late policies, and grading

Homework and the take-home midterm are due at the start of class. Problems may be turned in late, with a penalty of 10% for each day they are late, but may not be turned in after the start of the next class after they are due. For example, if a problem set is due on Tuesday, it may be turned in before Wednesday at 12:30pm with a 10% penalty, or before Thursday at 12:30pm with a 20% penalty, but no later than that.
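As an informal illustration of this arithmetic (not part of the policy itself), the sketch below computes a late-adjusted score; the function name is made up, and it assumes the 10% penalty is taken off the assignment's maximum score.

    # Hypothetical sketch of the late policy above, assuming scores out of 100
    # and a 10-point (10%) deduction per day late.
    def late_adjusted_score(raw_score, days_late, max_score=100):
        if days_late > 2:
            return 0.0  # not accepted after the start of the next class after the due date
        return max(0.0, raw_score - 0.10 * days_late * max_score)

    print(late_adjusted_score(90, 1))  # 80.0: one day late costs 10 points
    print(late_adjusted_score(90, 2))  # 70.0: two days late costs 20 points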

Some homeworks and the exams may have a special challenge problem. Points from the challenge problems are extra credit. This means that I do not consider these points until after the final course grade cutoffs have been set. Students participating in class discussion or asking good questions may also receive extra credit.

Each problem set and the presentation will count for 10% of the final grade.  The midterm will count for 20%, and the final will count for 40%.
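As a quick check that these weights sum to 100% (three problem sets and the presentation at 10% each, the midterm at 20%, and the final at 40%), here is a minimal sketch of the weighting; the function and the sample scores are hypothetical.

    # Hypothetical sketch of the grade weighting above; component scores assumed out of 100.
    def course_grade(ps1, ps2, ps3, presentation, midterm, final):
        return (0.10 * (ps1 + ps2 + ps3 + presentation)   # 10% each
                + 0.20 * midterm                           # 20%
                + 0.40 * final)                            # 40%

    # 0.10*(85+90+80+95) + 0.20*75 + 0.40*88 = 35.0 + 15.0 + 35.2 = 85.2
    print(course_grade(85, 90, 80, 95, 75, 88))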

 

 

Academic Honesty

All class work is to be done independently. You are allowed to discuss class material, homework problems, and general solution strategies with your classmates. When it comes to formulating, writing, or programming solutions, you must work alone. If you make use of other sources in coming up with your answers you must cite these sources clearly (papers or books in the literature, friends or classmates, information downloaded from the web, whatever).

It is best to try to solve problems on your own, since problem solving is an important component of the course. But I will not deduct points if you make use of outside help, provided that you cite your sources clearly. Representing other people's work as your own, however, is plagiarism and is in violation of university policies. Instances of academic dishonesty will be dealt with harshly, and usually result in a hearing in front of a student honor council and a grade of XF. (Note: this and other course policies are taken from those of Prof. David Mount.)



Absences

Any student who needs to be excused for an absence from a single lecture, recitation, or lab due to a medically necessitated absence shall: a) make a reasonable attempt to inform the instructor of his/her illness prior to the class; and b) upon returning to class, present the instructor with a self-signed note attesting to the date of the illness. Each note must contain an acknowledgment by the student that the information provided is true and correct. Providing false information to University officials is prohibited under Part 9(h) of the Code of Student Conduct (V-1.00(B) University of Maryland Code of Student Conduct) and may result in disciplinary action. Self-documentation may not be used for the Major Scheduled Grading Events defined below, and may be used for only one class meeting (or more, at the instructor's discretion) during the semester.

Any student who needs to be excused for a prolonged absence (two or more consecutive class meetings), or for a Major Scheduled Grading Event, must provide written documentation of the illness from the Health Center or from an outside health care provider. This documentation must verify dates of treatment and indicate the timeframe during which the student was unable to meet academic responsibilities. In addition, it must contain the name and phone number of the medical service provider to be used if verification is needed. No diagnostic information will ever be requested.

The Major Scheduled Grading Events for this course are: the final exam, as given in the University schedule.

Academic Accommodations

Any student eligible for and requesting reasonable academic accommodations due to a disability should provide the instructor, during office hours, with a letter of accommodation from the Office of Disability Support Services (DSS) within the first two weeks of the semester.

 

 

Assignments

Assignment        Assigned      Due
Problem Set 1     9/11/18       9/25/18
Problem Set 2     9/25/18       10/9/18
Problem Set 3     10/9/18       10/23/18
Midterm           10/23/18      10/30/18
Presentation                    11/13/18

 

 

Tentative Schedule

 

 

 

Class 1 (8/28): Introduction

Class 2 (8/30): Intro to Machine Learning
Reading: Deep Learning, Chapter 5

Class 3 (9/4): Intro to Machine Learning: linear models (SVMs and perceptrons, logistic regression)
Reading: For logistic regression, see this chapter from Cosma Shalizi

Class 4 (9/6): Intro to Neural Nets: what a network computes
Readings:
    Deep Learning, Chapter 6
    Neural Networks and Deep Learning, Chapter 2

Class 5 (9/11): Training a network: loss functions, backpropagation
Readings:
    A Tutorial on Energy-Based Learning, by LeCun et al.
    Neural Networks and Deep Learning, Chapter 3

Class 6 (9/13): Neural networks as universal function approximators
Readings:
    Approximation by Superpositions of a Sigmoidal Function, by George Cybenko (1989)
    Multilayer Feedforward Networks are Universal Approximators, by Kurt Hornik, Maxwell Stinchcombe, and Halbert White (1989)
    Neural Networks and Deep Learning, Chapter 4

Class 7 (9/18): Convolution and Fourier Transforms
Reading: Convolution and Fourier Transforms

Class 8 (9/20): CNNs cont'd: stochastic gradient descent, batch normalization, Siamese networks, early stopping, transfer learning, brief history of neural networks
Readings:
    Deep Learning, Chapter 7
    Deep Learning, Chapter 9

Class 9 (9/25): Implementation of deep learning: deep learning frameworks and the software stack, hyperparameter optimization, hardware acceleration, debugging
Presenter: Justin

Class 10 (9/27): Implementation of deep learning, cont'd
Presenter: Justin

Class 11 (10/2): Deeper networks: the vanishing gradient, skip connections, ResNet
Readings:
    Very Deep Convolutional Networks for Large-Scale Image Recognition, by Simonyan and Zisserman
    Deep Residual Learning for Image Recognition, by He et al.
    Residual Networks are Exponential Ensembles of Relatively Shallow Networks, by Veit et al.
    Densely Connected Convolutional Networks, by Huang et al.
Also of interest:
    Neural Networks and Deep Learning, Chapter 5
    On the Difficulty of Training Recurrent Neural Networks, by Pascanu et al.

Class 12 (10/4): Optimization: convex vs. non-convex functions, convergence of GD and SGD, Adam optimizer, initialization, leaky ReLU, momentum, changing step sizes
Reading: Deep Learning, Chapter 8

Class 13 (10/9): Convergence in deep networks: minima that do and don't generalize, broad vs. narrow minima, GD vs. SGD, the loss landscape
Readings:
    Understanding Deep Learning Requires Rethinking Generalization, by Zhang et al.
    Visualizing the Loss Landscape of Neural Nets, by Li et al.
    VC dimension and Rademacher complexity are discussed in many places, e.g., these notes
    On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima, by Keskar et al.

Class 14 (10/11): Dimensionality reduction, linear (PCA, LDA) and manifolds; random projections
Readings:
    PCA (slides from Olga Veksler)
    LDA (slides from Olga Veksler)
    An Elementary Proof of the Johnson-Lindenstrauss Lemma, by Dasgupta and Gupta

Class 15 (10/16): Low-dimensional embedding, metric learning
Readings:
    Efficient Estimation of Word Representations in Vector Space, by Mikolov et al.
    FaceNet: A Unified Embedding for Face Recognition and Clustering, by Schroff et al.
    Metric Learning: A Survey, by Brian Kulis

Class 16 (10/18): Adversarial attacks
Readings:
    Threat of Adversarial Attacks on Deep Learning in Computer Vision: A Survey, by Akhtar and Mian
    Intriguing Properties of Neural Networks, by Szegedy et al.
    Explaining and Harnessing Adversarial Examples, by Goodfellow et al.
    A Boundary Tilting Perspective on the Phenomenon of Adversarial Examples, by Tanay and Griffin
    Poison Frogs! Targeted Clean-Label Poisoning Attacks on Neural Networks, by Shafahi et al.

Class 17 (10/23): AI safety and the future of AI
Presenter: Justin

Class 18 (10/25): Autoencoders, variational autoencoders, and dimensionality reduction in networks
Presenter: Chen
Readings:
    Deep Learning, Chapter 14
    Tutorial on Variational Autoencoders, by Carl Doersch

Class 19 (10/30): Generative models, GANs
Readings:
    Generative Adversarial Networks, by Goodfellow et al.
    Towards Principled Methods for Training Generative Adversarial Networks, by Arjovsky and Bottou
    Wasserstein GAN, by Arjovsky et al.

Class 20 (11/1): Go over midterm. Image-to-image translation.
Reading: Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks, by Zhu et al.

Class 21 (11/6): Reinforcement learning
Reading: Reinforcement Learning: An Introduction, by Sutton and Barto. Understanding Chapters 3 and 6 is important, but reading Chapters 4 and 5 will probably help with 6. Chapter 1 is fun and quick to read.

Class 22 (11/8): Deep reinforcement learning
Readings:
    Reinforcement Learning: An Introduction, by Sutton and Barto
    Deep Learning, Sections 16.1, 16.5, 16.6

Class 23 (11/13): Why are deep networks better than shallow?
Readings:
    On the Number of Linear Regions of Deep Neural Networks, by G. F. Montufar, R. Pascanu, K. Cho, and Y. Bengio. NIPS, pages 2924-2932, 2014.
    The Power of Depth for Feedforward Neural Networks, by Ronen Eldan and Ohad Shamir. 29th Conference on Learning Theory.
    Benefits of Depth in Neural Networks, by Matus Telgarsky

Class 24 (11/15): Catching up on previous topics

Class 25 (11/20): Recurrent neural nets
Reading: Deep Learning, Chapter 10, especially from the beginning through Section 10.2, plus Section 10.10

Class 26 (11/27): Student presentations
Presenters:
    Visual Question Answering -- Ishita, Pranav, and Shlok
    Bayesian Deep Learning -- Sam and Susmija
Readings:
    Don't Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering, by Agrawal et al.
    Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning, by Yarin Gal and Zoubin Ghahramani

Class 27 (11/29): Conclusions

Class 28 (12/4): Student presentations
Presenters:
    Capsule Networks -- Mansi, Sahil, and Saumya
    Graph Convolutional Networks -- Kamal, Sneha, and Uttaran
Readings:
    Dynamic Routing Between Capsules, by Sara Sabour, Nicholas Frosst, and Geoffrey Hinton
    Spectral Networks and Locally Connected Networks on Graphs, by Joan Bruna, Wojciech Zaremba, Arthur Szlam, and Yann LeCun

Class 29 (12/6): Student presentations
Presenters:
    Memory-Augmented Neural Networks and Meta-Learning -- Samuel, Alex, and Alex
    Depth, Pose, and Flow from Images -- Abhishek, Nirat, Snehesh, and Chahat
Readings:
    One-Shot Learning with Memory-Augmented Neural Networks, by Santoro et al.
    GeoNet: Unsupervised Learning of Dense Depth, Optical Flow and Camera Pose, by Yin and Shi

Final Exam: 12/17, 1:30-3:30