CMSC 828L Deep Learning
Staff
Professor David Jacobs, AV Williams 4421
Email: djacobs – at – cs
TAs: Justin Terry, Email: justinkterry – at – gmail
Chen Zhu, Email: chenzhu – at – cs
Office Hours:
Monday, 5–6: Justin.
Tuesday, 3–4: David.
Wednesday, 10–11: David.
Wednesday, 5–6: Justin.
Thursday, 4–6: Chen.
Location: TA office hours will be in 4101 or 4103 AV Williams, depending on availability (check both rooms). Prof. Jacobs's office hours will be in 4421 AV Williams.
Readings
The following two books are available online:
Deep Learning, by Ian Goodfellow, Yoshua Bengio, and Aaron Courville
Neural Networks and Deep Learning, by Michael Nielsen
Other reading material appears in the schedule below.
Requirements
Students registered for this class must complete the following assignments:
Presentation: Students will form groups of three. Each group will prepare a 30-minute presentation on a topic of their choice. Students will select two papers and will present a summary and critical analysis of the material in these papers, along with any other appropriate background or related material. Students should video-record their presentation and submit a link to their video. Presentations will be graded on the choice of topic (is the material interesting?), clarity of presentation (do we understand the key points?), focus (does the presentation highlight the most important parts of the work, rather than uniformly summarizing everything?), and analysis (does the presentation help us understand the strengths and limitations of the presented work?). The six leading presentations will be selected for live presentation to the full class.
Problem Sets: There will be three problem sets assigned during the course. These will include programming projects and may also include written exercises.
Midterm: There will be a one-week, take-home midterm. This will include paper-and-pencil exercises.
Final Exam: There will be an in-class final exam.
Course Policies 

Course work, late policies, and grading 
Homework and the take-home midterm are due at the start of class. Problems may be turned in late, with a penalty of 10% for each day they are late, but may not be turned in after the start of the next class after they are due. For example, if a problem set is due on Tuesday, it may be turned in before Wednesday at 12:30pm with a 10% penalty, or before Thursday at 12:30pm with a 20% penalty, but no later than that. Some homeworks and the exams may have a special challenge problem. Points from challenge problems are extra credit: I do not consider these points until after the final course grade cutoffs have been set. Students participating in class discussion or asking good questions may also receive extra credit. Each problem set and the presentation will count for 10% of the final grade; the midterm will count for 20%, and the final will count for 40%. 
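Since the weights and the late penalty are plain arithmetic, they can be sketched in a few lines of Python (a hypothetical illustration: the item names are invented, and the "10% for each day" penalty is read here as 10% of the earned score per day late):

```python
# Grade weights as stated above: three problem sets and the presentation
# count 10% each, the midterm 20%, and the final exam 40%.
WEIGHTS = {
    "ps1": 0.10, "ps2": 0.10, "ps3": 0.10,
    "presentation": 0.10,
    "midterm": 0.20,
    "final": 0.40,
}

def late_adjusted(score, days_late):
    """Apply the 10%-per-day late penalty; work is not accepted
    after the start of the second class following the due date."""
    if days_late > 2:
        raise ValueError("too late: not accepted after the next class")
    return score * (1 - 0.10 * days_late)

def course_grade(scores, days_late=None):
    """Weighted average of per-item scores (each in [0, 100])."""
    days_late = days_late or {}
    return sum(
        WEIGHTS[item] * late_adjusted(score, days_late.get(item, 0))
        for item, score in scores.items()
    )

scores = {"ps1": 90, "ps2": 80, "ps3": 85,
          "presentation": 95, "midterm": 88, "final": 92}
# One day late on problem set 2 costs 10% of its score.
print(round(course_grade(scores, days_late={"ps2": 1}), 1))  # 88.6
```

Note that extra-credit challenge points are deliberately left out of the weighted sum, since they are only considered after the grade cutoffs are set.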
Academic Honesty 
All class work is to be done independently. You are allowed to discuss class material, homework problems, and general solution strategies with your classmates. When it comes to formulating/writing/programming solutions, you must work alone. If you make use of other sources in coming up with your answers, you must cite these sources clearly (papers or books in the literature, friends or classmates, information downloaded from the web, whatever). It is best to try to solve problems on your own, since problem solving is an important component of the course, but I will not deduct points if you make use of outside help, provided that you cite your sources clearly. Representing other people's work as your own, however, is plagiarism and is in violation of university policies. Instances of academic dishonesty will be dealt with harshly, and usually result in a hearing in front of a student honor council and a grade of XF. (Note: this and other course policies are taken from those of Prof. David Mount.) 

Any student who needs to be excused for an absence from a single lecture, recitation, or lab due to a medically necessitated absence shall: a) make a reasonable attempt to inform the instructor of his/her illness prior to the class; b) upon returning to the class, present the instructor with a self-signed note attesting to the date of the illness. Each note must contain an acknowledgment by the student that the information provided is true and correct. Providing false information to University officials is prohibited under Part 9(h) of the Code of Student Conduct (V-1.00(B) University of Maryland Code of Student Conduct) and may result in disciplinary action. Self-documentation may not be used for the Major Scheduled Grading Events as defined below, and it may be used for only one class meeting (or more, if you choose) during the semester. Any student who needs to be excused for a prolonged absence (two or more consecutive class meetings), or for a Major Scheduled Grading Event, must provide written documentation of the illness from the Health Center or from an outside health care provider. This documentation must verify dates of treatment and indicate the timeframe during which the student was unable to meet academic responsibilities. In addition, it must contain the name and phone number of the medical service provider, to be used if verification is needed. No diagnostic information will ever be requested. The Major Scheduled Grading Events for this course include the final exam, as given in the University schedule. 
Academic Accommodations 
Any
student eligible for and requesting reasonable academic accommodations due to
a disability is requested to provide, to the instructor in office hours, a
letter of accommodation from the Office of Disability Support Services (DSS)
within the first two weeks of the semester. 
Assignments

                 Assigned    Due
Problem Set 1    9/11/18     9/25/18
Problem Set 2    9/25/18     10/9/18
Problem Set 3    10/9/18     10/23/18
Midterm          10/23/18    10/30/18
Presentation     —           11/13/18
Tentative Schedule

Date 
Topic 
Presenters 
Reading 
Class 1 
8/28 
Introduction 


Class 2 
8/30 
Intro to Machine Learning 

Deep Learning, Chapter 5 
Class 3 
9/4 
Intro to Machine Learning: Linear models (SVMs, perceptrons, and logistic regression) 

For logistic regression, see this chapter from Cosma Shalizi 
Class 4 
9/6 
Intro to Neural Nets: What a network computes. 

Deep Learning, Chapter 6 Neural Networks and Deep Learning, Chapter 2 
Class 5 
9/11 
Training a network: loss functions, backpropagation. 

A Tutorial on Energy-Based Learning, by LeCun et al. Neural Networks and Deep Learning, Chapter 3 
Class 6 
9/13 
Neural networks as universal function approximators 

Approximation by Superpositions of a Sigmoidal Function, by George Cybenko (1989). Multilayer Feedforward Networks are Universal Approximators, by Kurt Hornik, Maxwell Stinchcombe, and Halbert White (1989). Neural Networks and Deep Learning, Chapter 4 
Class 7 
9/18 
Convolution and Fourier Transforms 


Class 8 
9/20 
CNNs cont'd. Stochastic gradient descent, batch normalization, Siamese networks, early stopping, transfer learning, brief history of neural networks. 
Deep Learning, Chapter 7 Deep Learning, Chapter 9 

Class 9 
9/25 
Implementation of deep learning. Deep learning frameworks and the software stack, hyperparameter optimization, hardware acceleration, debugging. 
Justin 

Class 10 
9/27 
Implementation of deep learning, cont'd 
Justin 

Class 11 
10/2 
Deeper networks. The vanishing gradient, skip connections, resnet. 

Very Deep Convolutional Networks for Large-Scale Image Recognition, by Simonyan and Zisserman. Deep Residual Learning for Image Recognition, by He et al. Residual Networks are Exponential Ensembles of Relatively Shallow Networks, by Veit et al. Densely Connected Convolutional Networks, by Huang et al. Also of interest: Neural Networks and Deep Learning, Chapter 5; On the Difficulty of Training Recurrent Neural Networks, by Pascanu et al. 
Class 12 
10/4 
Optimization. Convex vs. nonconvex functions. Convergence of GD and SGD, the Adam optimizer, initialization, leaky ReLU, momentum, changing step sizes. 
Deep Learning, Chapter 8 

Class 13 
10/9 
Convergence in deep networks. Minima that do/don't generalize. Broad vs. narrow minima. GD vs. SGD. The loss landscape. 
Understanding Deep Learning Requires Rethinking Generalization, by Zhang et al. Visualizing the Loss Landscape of Neural Nets, by Li et al. VC dimension and Rademacher complexity are discussed in many places, e.g., these notes. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima, by Keskar et al. 

Class 14 
10/11 
Dimensionality reduction, linear (PCA, LDA) and manifolds, random projections. 
PCA (slides from Olga Veksler). LDA (slides from Olga Veksler). An Elementary Proof of the Johnson-Lindenstrauss Lemma, by Dasgupta and Gupta 

Class 15 
10/16 
Low-dimensional embedding, metric learning 
Efficient Estimation of Word Representations in Vector Space by Mikolov et al. Facenet: a Unified Embedding for Face Recognition and Clustering by Schroff et al. Metric Learning, a Survey, by Brian Kulis 

Class 16 
10/18 
Adversarial attacks 

Threat of Adversarial Attacks on Deep Learning in Computer Vision: A Survey, by Akhtar and Mian. Intriguing Properties of Neural Networks, by Szegedy et al. Explaining and Harnessing Adversarial Examples, by Goodfellow et al. A Boundary Tilting Perspective on the Phenomenon of Adversarial Examples, by Tanay and Griffin. Poison Frogs! Targeted Clean-Label Poisoning Attacks on Neural Networks, by Shafahi et al. 
Class 17 
10/23 
AI Safety and the future of AI 
Justin 

Class 18 
10/25 
Autoencoders, Variational Autoencoders, and dimensionality reduction in networks 
Chen 
Deep Learning, Chapter 14. Tutorial on Variational Autoencoders, by Carl Doersch 
Class 19 
10/30 
Generative models, GANs. 

Generative Adversarial Networks, by Goodfellow et al. Towards Principled Methods for Training Generative Adversarial Networks, by Arjovsky and Bottou. Wasserstein GAN, by Arjovsky et al. 
Class 20 
11/1 
Go over midterm. Image-to-image translation 
Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks, by Zhu et al. 

Class 21 
11/6 
Reinforcement learning 
Reinforcement Learning, An Introduction by Sutton and Barto Understanding Chapters 3 and 6 is important, but reading 4 and 5 will probably help with 6. Chapter 1 is fun and quick to read. 

Class 22 
11/8 
Deep reinforcement learning 
Reinforcement Learning, An Introduction by Sutton and Barto Deep Learning Sections 16.1, 16.5, 16.6 

Class 23 
11/13 
Why are deep networks better than shallow? 
On the Number of Linear Regions of Deep Neural Networks, by G. F. Montufar, R. Pascanu, K. Cho, and Y. Bengio. In NIPS, pages 2924–2932, 2014. The Power of Depth for Feedforward Neural Networks, by Eldan and Shamir. Benefits of Depth in Neural Networks, by Matus Telgarsky. 

Class 24 
11/15 
Catching up on previous topics. 

Class 25 
11/20 
Recurrent neural nets. 
Deep Learning, Chapter 10, especially from the beginning
through 10.2, and Section 10.10 

Class 26 
11/27 
Student presentations 
Visual Question Answering – Ishita, Pranav, and Shlok. Bayesian Deep Learning – Sam and Susmija 
Don't Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering, by Agrawal et al. Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning, by Gal and Ghahramani. 
Class 27 
11/29 
Conclusions 


Class 28 
12/4 
Student presentations 
Mansi, Sahil, and Saumya – Capsule Networks. Kamal, Sneha, and Uttaran – Graph Convolutional Networks 
Dynamic Routing Between Capsules, by Sara Sabour, Nicholas Frosst, and Geoffrey Hinton. Spectral Networks and Locally Connected Networks on Graphs 
Class 29 
12/6 
Student presentations 
Samuel, Alex, and Alex – Memory-Augmented Neural Networks and Meta-Learning. Abhishek, Nirat, Snehesh, and Chahat – Depth, Pose, and Flow from Images. 
One-Shot Learning with Memory-Augmented Neural Networks, by Santoro et al. GeoNet: Unsupervised Learning of Dense Depth, Optical Flow and Camera Pose, by Yin and Shi 
Final 
12/17, 1:30–3:30 


