The basics
Communications and questions for this course will be managed through the course page on Piazza, which you can find here.
When: TuTh 11:00am - 12:15pm | Where: CSI 3120 |
Instructor: Tom Goldstein | Office Hours: Th 1-2pm, AVW 3141 |
TA: Hao Li | Office Hours: TBD |
TA: Sohil Shah | Office Hours: Wed 2-3, AVW 3364 |
Final Exam: TBD
Coursework & grading
Homework will be assigned approximately once per week. Assignments will consist of both programming exercises and theoretical problem sets. Programming assignments are cumulative: you will need the results of earlier assignments to complete those given later in the course. The completed assignments you turn in must represent your own work.
You may complete your assignments in either Matlab or Python 2.7. Other languages (including Python 3+) will not be allowed. For programming assignments, you will be required to prepare a short PDF document containing the outputs of your code, and this PDF will be turned in along with your code.
The approximate grade breakdown of the course will be:
- 50% homework
- 25% midterm exam
- 25% final exam
Note: This is a PhD qualifying course for computer science.
Homework
Homework will be turned in using Elms/Canvas. Follow this link to Elms, and then look for the course page after logging in. I recommend generating your PDF using LaTeX. Have a look at these example solutions, and the corresponding LaTeX source.
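For reference, here is a minimal sketch of what a LaTeX homework write-up might look like. This is a hypothetical skeleton, not the linked example solutions, and the commented-out figure file name is made up:

```latex
% Minimal homework write-up skeleton (hypothetical, not the linked example solutions).
% Compile with pdflatex to produce the PDF that is turned in alongside your code.
\documentclass[11pt]{article}
\usepackage{amsmath,amssymb}  % math environments and symbols
\usepackage{graphicx}         % \includegraphics for plots produced by your code

\title{Homework 1}
\author{Your Name}
\date{\today}

\begin{document}
\maketitle

\section*{Problem 1}
State the problem, then give your derivation. For example, for
$f(x) = \tfrac{1}{2}\|Ax - b\|_2^2$ the gradient is
\[
  \nabla f(x) = A^\top (A x - b).
\]

\section*{Problem 2 (programming)}
Briefly describe your implementation and include the outputs of your code,
e.g.\ a convergence plot:
% \includegraphics[width=0.7\textwidth]{convergence_plot.pdf}

\end{document}
```

Running pdflatex on a file like this produces a single PDF that can be uploaded to Elms/Canvas together with your code.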
Lecture Slides
Course Overview
Linear Algebra Review
Optimization Problems
TV, FFT, and Calculus
Quadratic Forms
Convex Functions
Gradient Methods
Quasi-Newton Methods
Duality
Proximal Methods
Lagrangian Methods
Random topics | MCMC code example
Book & Other Sources
All course materials are available for free online. Suggested reading material for various topics includes:
- Derivatives and gradients: Convex Optimization by Boyd and Vandenberghe, Appendix A
- Numerical Linear Algebra: Numerical Linear Algebra by Trefethen and Bau
- Randomized matrix factorizations: Finding Structure with Randomness
- L1 models and sparsity: Sparse Modeling for Image and Vision Processing
- Convex functions and gradient methods: Convex Optimization by Boyd and Vandenberghe
- Convergence rates for gradient methods: Optimal Rates in Convex Optimization
- Proximal methods: A Field Guide to Forward-Backward Splitting
- ADMM: Fast Alternating Direction Optimization Methods
- Consensus ADMM: Distributed Optimization and Statistical Learning
- Unwrapped ADMM: Unwrapping ADMM
- PDHG: Adaptive Primal-Dual Hybrid Gradient Methods
- SGD: Incremental Gradient, Subgradient, and Proximal Methods
- SGD convergence rates: Stochastic Gradient Descent for Non-Smooth Optimization
- Monte-Carlo: An Introduction to MCMC for Machine Learning
- Barrier Methods: Convex Optimization by Boyd and Vandenberghe, chapter 11
- Primal-Dual Interior Point Methods: Numerical Optimization by Nocedal and Wright, chapter 14
- Semi-definite programming: Vandenberghe and Boyd
- Metric learning: Distance Metric Learning for LMNN