Neural networks can generalize to test data that were never seen during training. The origins of this generalization remain mysterious and have largely eluded theoretical understanding. We try to gain an intuitive grasp of generalization through carefully crafted experiments.
Adversarial training is the most effective known way to harden a neural net against attacks, but it costs 10-100X more than standard training. We show how to perform adversarial training at no added cost, and then train a robust ImageNet model on a desktop computer in just one day.
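As a rough illustration of the attacks being defended against, the sketch below perturbs the input of a toy linear softmax classifier in the direction that increases its loss (an FGSM-style step). The model, weights, and budget `eps` here are hypothetical stand-ins for illustration, not the paper's setup or its training method.

```python
import numpy as np

# Toy linear "model": logits = W @ x. (Hypothetical stand-in; the paper
# trains deep ImageNet models, not a linear classifier.)
rng = np.random.default_rng(1)
W = rng.standard_normal((3, 8))
x = rng.standard_normal(8)
y = 0        # true class index
eps = 0.1    # L-infinity perturbation budget

def grad_wrt_x(W, x, y):
    # Gradient of the cross-entropy loss with respect to the input x.
    logits = W @ x
    p = np.exp(logits - logits.max())
    p /= p.sum()
    p[y] -= 1.0          # dL/dlogits = softmax(logits) - onehot(y)
    return W.T @ p

# FGSM: take one signed gradient step in input space to raise the loss.
x_adv = x + eps * np.sign(grad_wrt_x(W, x, y))
```

Adversarial training folds examples like `x_adv` back into the training loop, which is why it normally multiplies the cost of each epoch.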
Stacked U-Nets are a simple, easy-to-train neural architecture for image segmentation and other image-to-image regression tasks. SUNets attain state-of-the-art performance and fast inference with very few parameters.
It is well known that certain neural network architectures produce loss functions that train more easily and generalize better, but the reasons for this are not well understood. To probe this question, we explore the structure of neural loss functions using a range of visualization methods.
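The simplest such visualization is a 1D slice: evaluate the loss along a line through the trained parameters in a random direction. A minimal sketch, using a hypothetical quadratic toy loss in place of a real network (the names `H`, `theta_star`, and `loss` are placeholders, not the paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "network": a fixed positive-definite quadratic loss over a
# small parameter vector. (Toy example; real experiments use trained nets.)
A = rng.standard_normal((10, 10))
H = A @ A.T + np.eye(10)              # positive-definite Hessian
theta_star = rng.standard_normal(10)  # "trained" parameters (the minimizer)

def loss(theta):
    d = theta - theta_star
    return 0.5 * d @ H @ d

# 1D slice: L(alpha) = loss(theta* + alpha * d) along a unit-norm
# random direction d, traced over a small interval around the minimizer.
d = rng.standard_normal(10)
d /= np.linalg.norm(d)
alphas = np.linspace(-1.0, 1.0, 41)
curve = [loss(theta_star + a * d) for a in alphas]
```

Plotting `curve` against `alphas` reveals how sharp or flat the minimizer looks along that direction; richer 2D versions of the same idea fill contour plots.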
Adversarial networks are notoriously hard to train, and standard training procedures often collapse. We present a simple modification to the standard training method that increases stability. The method is provably stable for a class of saddle-point problems, and improves the performance of numerous GANs.
Classical machine learning methods, including stochastic gradient descent (aka backprop), work great on one machine, but don't scale well to the cloud or cluster setting. We propose a variety of algorithmic frameworks for scaling machine learning across many workers.
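The baseline that such frameworks improve upon is synchronous data-parallel SGD: each worker computes a gradient on its shard of the data, and a server averages them into one step. A minimal single-process simulation on a least-squares problem (the shard layout, step size, and problem are illustrative assumptions, not the paper's algorithms):

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic least-squares problem: recover w_true from y = X @ w_true.
X = rng.standard_normal((128, 5))
w_true = rng.standard_normal(5)
y = X @ w_true

def shard_grad(w, Xs, ys):
    # Gradient of (1/2n) * ||Xs @ w - ys||^2 on one worker's shard.
    return Xs.T @ (Xs @ w - ys) / len(ys)

w = np.zeros(5)
shards = np.array_split(np.arange(128), 4)   # 4 simulated workers
for _ in range(300):
    grads = [shard_grad(w, X[i], y[i]) for i in shards]
    w -= 0.1 * np.mean(grads, axis=0)        # synchronous averaging step
```

Because the shards are equal-sized, the averaged gradient equals the full-batch gradient; the scaling difficulty in practice comes from the synchronization barrier and communication cost of that averaging step, which is what asynchronous and communication-efficient variants attack.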