The power of neural networks lies in their ability to generalize to unseen data, yet the underlying reasons for this phenomenon remain elusive. Numerous rigorous attempts have been made to explain generalization, but available bounds are still quite loose, and analysis does not always lead to true understanding. The goal of this work is to make generalization more intuitive. Using visualization methods, we discuss the mystery of generalization, the geometry of loss landscapes, and how the curse (or, rather, the blessing) of dimensionality causes optimizers to settle into minima that generalize well.

Attacks on copyright systems

in machine learning, adversarial learning

2019-05-28

Overview

Copyright detection systems are among the most widely used machine learning systems in industry, and the security of these systems is of foundational importance to some of the largest companies in the world. Examples include YouTube’s Content ID, which has resulted in more than 3 billion dollars in revenue for copyright holders, and Google Jigsaw, which has been developed to detect and remove videos that promote terrorism or jeopardized national security.

Adversarial training for FREE!

in machine learning, adversarial learning

2019-03-08

“Adversarial training,” in which a network is trained on adversarial examples, is one of the few defenses against adversarial attacks that withstands strong attacks. Unfortunately, the high cost of generating strong adversarial examples makes standard adversarial training impractical on large-scale problems like ImageNet. We present an algorithm that eliminates the overhead cost of generating adversarial examples by recycling the gradient information computed when updating model parameters.

Our “free” adversarial training algorithm is comparable to state-of-the-art methods on CIFAR-10 and CIFAR-100 datasets at negligible additional cost compared to natural training, and can be 7 to 30 times faster than other strong adversarial training methods.

Stacked U-Nets: A simple architecture for image segmentation

in machine learning

2018-06-01

Many imaging tasks require global information about all pixels in an image. For example, the output of an image classifier may depend on many pixels in separate regions of an image. For image segmentation, in which a neural network must produce a high-resolution map of classifications rather than a single output, each pixel’s label may depend on information from far away pixels.

Conventional bottom-up classification networks globalize information by decreasing resolution; features are pooled and downsampled into a single output that “sees” the whole image. But for semantic segmentation, object detection, and other image-to-image regression tasks, a network must preserve and output high-resolution maps, and so pooling alone is not an option. To globalize information while preserving resolution, many researchers propose the inclusion of sophisticated auxiliary blocks, but these come at the cost of a considerable increase in network size, computational cost, and implementation complexity.

Visualizing the Loss Landscape of Neural Nets

in optimization, machine learning

2018-01-05

Neural network training relies on our ability to find “good” minimizers of highly non-convex loss functions. It is well known that certain network architecture designs (e.g., skip connections) produce loss functions that train easier, and well-chosen training parameters (batch size, learning rate, optimizer) produce minimizers that generalize better. However, the reasons for these differences, and their effects on the underlying loss landscape, are not well understood.

Stabilizing GANs with Prediction

in optimization, machine learning

2017-12-11

Why are GANs hard to train?

Adversarial neural networks solve many important problems in data science, but are notoriously difficult to train. These difficulties come from the fact that optimal weights for adversarial nets correspond to saddle points, and not minimizers, of the loss function. The alternating stochastic gradient methods typically used for such problems do not reliably converge to saddle points, and when convergence does happen it is often highly sensitive to learning rates.

PhasePack

in optimization

2017-11-20

PhasePack

A phase retrieval library

Phase retrieval is the recovery of a signal from only the magnitudes, and not the phases, of complex-valued linear measurements. Phase retrieval problems arise in many different applications, particularly in crystallography and microscopy. Mathematically, phase retrieval recovers a complex valued signal \(x\in \mathbb{C}^n\) from \(m\) measurements of the form

Distributed Machine Learning

in machine learning

2016-11-27

Distributed

Machine Learning

-

Classical machine learning methods, include stochastic gradient descent (also known as backprop), work great on one machine, but don’t scale well to the cloud or cluster setting. We propose a variety of algorithmic frameworks for scaling machine learning across many workers. Many of our distributed ML experiments are done using USNA’s Grace Supercomputer, which is currently hosted at University of Maryland.

The stone transform: flexible compressed sensing

in image processing

2014-07-01

Overview

Compressive sensing enables the reconstruction of high-resolution signals from under-sampled data. While compressive methods simplify data acquisition, they require the solution of difficult recovery problems to make use of the resulting measurements. The stone transform is a new sensing framework that combines the advantages of both conventional and compressive sensing. Using the proposed stone transform, measurements can be reconstructed instantly at Nyquist rates at any power-of-two resolution. The same data can then be “enhanced” to higher resolutions using compressive methods that leverage sparsity to “beat” the Nyquist limit. The availability of a fast direct reconstruction enables compressive measurements to be processed on small embedded devices. We demonstrate this by constructing a real-time compressive video camera.

Optimization

Overview

Why are GANs hard to train?

PhasePack

A phase retrieval library

Distributed

Machine Learning

Overview