This course is a detailed survey of optimization. Topics are mainly covered from a computational perspective, but theoretical issues are also addressed. Special emphasis will be put on scalable methods with applications in machine learning, model fitting, and image processing. There are no formal pre-requisites for this course, however students should have a strong background in applied mathematics (especially linear algebra) and computer programming.
Computational topics will include gradient methods, stochastic methods, Bayesian methods, splitting methods, interior point methods, linear programming, and methods for large matrices. Theoretical topics will include convex analysis, duality, rates of convergence, and advanced topics in linear algebra. Homework assignments will require both mathematical work on paper and implementation of algorithms. |

### The basics

Communications and questions for this course will be managed through the course slack channel. You can join the channel here.

Lectures will take place over zoom. The link for the lectures will be communicated through slack.

When: TuTh 11:00pm - 12:15pm | Where: online | |

Instructor: Tom Goldstein | Office Hours: Th 12:15-1:30pm, Iribe 4212 | |

TA: Manli Shu | Office Hours: TBD | manlis@umd.edu |

TA: Amin Ghiasi | Office Hours: TBD | ghiasi@umd.edu |

Final Exam: TBD

### Coursework & grading

Homework will be assigned approximately once per week. Homework assignments will consist of both programming exercises and theoretical problem sets. Programming assignments will be cumulative - you will need the results of early assignments to complete assignments that are given later in the course. The completed homework assignments you turn in must represent your own work.

The approximate grade breakdown of the course will be

- 50% homework
- 25% midterm exam
- 25% final exam.

I reserve the right to change the relative weighting of homework/exams at any time. Furthermore, because this course has gone online, the structure of exams will be quite different than we’ve had in the past and I’m still looking into different options.

Note: This is a PhD qualifying course for computer science.

### Homework

Assignments will be distributed as Jupyter notebooks using Python 3. If you’ve never used notebooks before, you can find a tutorial here. For each assignment, you’ll turn in both a notebook, **and a PDF** of your completed notebook.

Homework will be turned in using Elms/Canvas. Follow this link to Elms, and then look for the course page after logging in.

Homework 1: Linear algebra review (Due Tues Feb 9)

Solutions

Homework 2: More linear algebra (Due Wed Feb 24)

Solutions

Homework 3: Gradients (Due Thurs March 4)

Solutions

For this assignment, you’ll need to download the notebook importer, and place it in the same folder as the notebook.

Homework 4: PySmorch (Due Fri March 12th) Solutions

For this assignment, you’ll need the notebook importer, and also this utility script.

Homework 5: Convex Functions (Due Fri April 9th) Solutions

Homework 6: Gradient Methods and Duality

Solutions
For this assignment, you’ll need the *new* version of the
utility script.

Homework 7: Splitting

Solutions

### Lecture Slides

These slides evolve from year to year, sometimes dramatically. For this reason, slides might get updated shortly before or after a lecture.

Slides below have not been updated…
Course Overview

Linear Algebra Review

Convolutions and FFT

Optimization Problems

Machine learning basics

Calculus

Neural Net Design

Quadratic Forms

Convex Functions

Gradient Methods

Quasi-Newton Methods

Duality

Proximal Methods

Lagrangian Methods
Random topics | MCMC code example

Linear Programming

Bonus lecture: generalization
Semidefinite Programming

Interior Point Method

### Book & Other Sources

All course materials are available for free online. Suggested reading material for various topics includes:

- Derivatives and gradients: Convex Optimization by Boyd and Vandenberghe, Appendix A
- Numerical Linear Algebra: Numerical Linear Algebra by Trefethen and Bau
- Randomized matrix factorizations: Finding Structure with Randomness
- L1 models and sparsity: Sparse modeling for Image and Vision Processing
- Convex functions and gradient methods: Convex Optimization by Boyd and Vandenberghe
- Line search methods: Chapter 3 of Nocedal and Wright
- Convergence rates for gradient methods: Optimal Rates in Convex Optimization
- Quasi-Newton methods: Chapter 3 of Nocedal and Wright
- Proximal methods: A Field Guide to Forward-Backward Splitting
- ADMM: Fast Alternating Direction Optimization Methods
- Consensus ADMM: Distributed Optimization and Statistical Learning
- Unwrapped ADMM: Unwrapping ADMM
- PDHG: Adaptive Primal-Dual Hybrid Gradient Methods
- SGD: Incremental Gradient, Subgradient, and Proximal Methods
- SGD convergence rates: Stochastic Gradient Descent for Non-Smooth Optimization
- Monte-Carlo: An Introduction to MCMC for Machine Learning
- Barrier Methods: Convex Optimization by Boyd and Vandenberghe, chapter 11
- Primal-Dual Interior Point Methods: Nocedal and Wright, chapter 14
- Interior Point Methods overview: IP methods: 25 years later
- Semi-definite programming: Vandenbergh and Boyd
- Metric learning: Distance Metric Learning for LMNN