CMSC 828L Deep Learning
Staff
Professor David Jacobs, AV Williams 4421
Office Hours: Tuesday, 11-12
Email: djacobs at cs
TAs: Chengxi Ye (yechengxi at gmail)
Angjoo Kanazawa (firstname.lastname at gmail)
Soumyadip Sengupta (senguptajuetce at gmail)
Jin Sun (firstnamelastname at cs)
Hao Zhou (zhhoper at gmail)
Readings
Much of the reading for the class will come from two books available online:
Deep Learning, by Ian Goodfellow, Yoshua Bengio, and Aaron Courville
Neural Networks and Deep Learning, by Michael Nielsen
Other reading material appears in the schedule below.
Requirements
Students registered for this class must complete the following assignments:
Presentation: Students will form eight groups of four students each. Each group will be responsible for one class. They will present papers and lead a discussion on one of the discussion topics listed on the schedule. Discussion topics are marked in blue (applications) and red (more theoretical material). Professor Jacobs will lead the discussion for topics not selected by the students. Note that there is room on the schedule for some groups to suggest their own topics. Presentations will be graded according to the following rubric.
Paper Summaries: For eight of the discussion classes, students must turn in a one-page summary of one of the papers to be discussed that day. Summaries should contain one paragraph that summarizes the paper and one paragraph that provides some analysis of the work, including suggestions for possible questions to discuss. Summaries must be handed in before the start of class, and students must attend class on the days on which they hand in summaries.
Problem Sets: There will be three problem sets assigned during the course. These will include programming projects and may also include written exercises.
Final Project: Students will undertake a final project for the class. Projects may be done alone or in teams. Students should discuss their topic with the professor.
Assignments
Problem Set      Assigned     Due
1                9/20/16      10/11/16
2                10/18/16     11/8/16
Final Project                 12/8/16

Tentative Schedule

Class 1 (8/30): Introduction

Class 2 (9/1): Intro to Machine Learning
Reading: Deep Learning, Chapter 5

Class 3 (9/6): Intro to Machine Learning: Linear models (SVMs and perceptrons, logistic regression)
Reading: For logistic regression, see this chapter from Cosma Shalizi

Class 4 (9/8): Intro to Neural Nets: What a shallow network computes
Reading: Deep Learning, Chapter 6; Neural Networks and Deep Learning, Chapter 2

Class 5 (9/13): Training a network: loss functions, backpropagation, and stochastic gradient descent
Reading: A Tutorial on Energy-Based Learning, by LeCun et al.; Neural Networks and Deep Learning, Chapter 3

Class 6 (9/15): Neural networks as universal function approximators
Reading: Approximation by Superpositions of a Sigmoidal Function, by George Cybenko (1989); Multilayer Feedforward Networks are Universal Approximators, by Kurt Hornik, Maxwell Stinchcombe, and Halbert White (1989); Neural Networks and Deep Learning, Chapter 4

Class 7 (9/20): Deep Networks: Backpropagation and regularization, batch normalization
Reading: Deep Learning, Chapter 7

Class 8 (9/22): VC Dimension and Neural Nets
Presenters: David
Reading: VC Dimension of Neural Networks, by Sontag

Class 9 (9/27): Why are deep networks better than shallow?
Presenters: David
Reading: On the Number of Linear Regions of Deep Neural Networks, by G. F. Montufar, R. Pascanu, K. Cho, and Y. Bengio (NIPS 2014, pp. 2924-2932); The Power of Depth for Feedforward Neural Networks, by Eldan and Shamir

Class 10 (9/29): Why are deep networks better than shallow?
Presenters: David
Reading: Benefits of Depth in Neural Networks, by Matus Telgarsky

Class 11 (10/4): Convolutional Networks
Reading: Deep Learning, Chapter 9

Class 12 (10/6): Applications: ImageNet
Presenters: David
Reading: ImageNet Classification with Deep Convolutional Neural Networks, by Krizhevsky et al.; Very Deep Convolutional Networks for Large-Scale Image Recognition, by Simonyan and Zisserman; Deep Residual Learning for Image Recognition, by He et al.; Residual Networks are Exponential Ensembles of Relatively Shallow Networks, by Veit et al. Also of interest: Neural Networks and Deep Learning, Chapter 5; On the Difficulty of Training Recurrent Neural Networks, by Pascanu et al.

Class 13 (10/11, ECCV): Applications: Detection
Presenters: Ankan, Upal, Amit, Weian
Reading: Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, by Girshick et al.

Class 14 (10/13, ECCV): Audio
Presenters: Jiao, Philip
Reading: WaveNet: A Generative Model for Raw Audio, by van den Oord et al. See also the WaveNet blog post.

Class 15 (10/18): What does a neuron compute?
Presenters: Nitin, Kiran
Reading: Visualizing and Understanding Convolutional Networks, by Zeiler and Fergus

Class 16 (10/20): Dimensionality reduction, linear (PCA, LDA) and manifolds, metric learning
Reading: PCA (slides from Olga Veksler); LDA (slides from Olga Veksler); Metric Learning: A Survey, by Brian Kulis; An Elementary Proof of the Johnson-Lindenstrauss Lemma, by Dasgupta and Gupta

Class 17 (10/25): Autoencoders and dimensionality reduction in networks
Reading: Deep Learning, Chapter 14

Class 18 (10/27): Applications: Natural Language Processing (e.g., word2vec)
Presenters: Amr, Prudhui, Sanghyun, Faez
Reading: Efficient Estimation of Word Representations in Vector Space, by Mikolov et al.

Class 19 (11/1): Applications: Joint Detection
Presenters: Chinmaya, Huaijen, Ahmed, Spandan
Reading: Convolutional Pose Machines, by Wei et al.; Stacked Hourglass Networks for Human Pose Estimation, by Newell et al.; Recurrent Network Models for Human Dynamics, by Fragkiadaki et al.

Class 20 (11/3): Neuroscience: What does a neuron do?
Presenters: David
Reading: Spiking Neuron Models (Cambridge Univ. Press), Chapter 1 and Sections 10.1 and 10.2

Class 21 (11/8): Applications: Bioinformatics
Presenters: Somay, Jay, Varun, Ashwin
Reading: Predicting Effects of Noncoding Variants with Deep Learning-Based Sequence Model, by Zhou and Troyanskaya

Class 22 (11/10): Optimization in Deep Networks
Presenters: Zheng
Reading: The Loss Surfaces of Multilayer Networks, by Choromanska et al.; No Bad Local Minima: Data Independent Training Error Guarantees for Multilayer Neural Networks, by Soudry and Carmon

Class 23 (11/15): Generalization in Neural Networks
Presenters: David
Reading: Generative Adversarial Networks, by Goodfellow et al.; Margin Preservation of Deep Neural Networks, by Sokolic

Class 24 (11/17): Applications: Face recognition
Presenters: Hui, Huijing, Mustafa
Reading: DeepFace: Closing the Gap to Human-Level Performance in Face Verification, by Taigman et al.; FaceNet: A Unified Embedding for Face Recognition and Clustering, by Schroff et al.; Deep Face Recognition, by Parkhi et al.

Class 25 (11/22): Spatial Transformer Networks
Presenters: Angjoo
Reading: Spatial Transformer Networks, by Jaderberg et al.; WarpNet: Weakly Supervised Matching for Single-view Reconstruction, by Kanazawa et al.

Class 26 (11/29): Recurrent networks, LSTM

Class 27 (12/1): Applications: Scene Understanding
Presenters: Abhay, Rajeev, Palabi
Reading: Attend, Infer, Repeat: Fast Scene Understanding with Generative Models, by Eslami et al.

Class 28 (12/6, NIPS): Applications: Generating Image Captions
Presenters: Mingze, Chirag, Wei, Yanzhou
Reading: Deep Fragment Embeddings for Bidirectional Image Sentence Mapping, by Karpathy et al.; Deep Visual-Semantic Alignments for Generating Image Descriptions, by Karpathy et al.; DenseCap: Fully Convolutional Localization Networks for Dense Captioning, by Johnson et al.

Class 29 (12/8, NIPS): Overview discussion
Presenters: David
Reading: Building Machines That Learn and Think Like People, by Brenden M. Lake, Tomer D. Ullman, Joshua B. Tenenbaum, and Samuel J. Gershman