It is well known that certain neural network architectures produce loss functions that are easier to minimize and whose minimizers generalize better, but the reasons for this are not well understood. To investigate, we explore the structure of neural loss functions using a range of visualization methods.
Adversarial networks are notoriously hard to train, and standard training methods often collapse. We present a simple modification to the standard training method that increases stability. The method is provably stable for a class of saddle-point problems, and it improves the performance of numerous GANs.
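To make the stability issue concrete, here is a minimal sketch on the toy saddle-point problem min over x, max over y of f(x, y) = x*y, whose only saddle point is the origin. The summary above does not say which modification is used, so the lookahead ("prediction") step below is an assumption chosen for illustration, not necessarily the method being described. The function names, step size, and starting point are likewise hypothetical.

```python
import math

def simultaneous_step(x, y, lr):
    """Plain gradient descent on x and ascent on y, both from the old iterate."""
    return x - lr * y, y + lr * x

def prediction_step(x, y, lr):
    """Assumed stabilizer: update x, extrapolate it, then update y against the extrapolation."""
    x_new = x - lr * y
    x_pred = x_new + (x_new - x)  # lookahead estimate of where x is heading
    y_new = y + lr * x_pred
    return x_new, y_new

def run(step, steps=1000, lr=0.1):
    """Iterate from (1, 1) and return the final distance to the saddle point (0, 0)."""
    x, y = 1.0, 1.0
    for _ in range(steps):
        x, y = step(x, y, lr)
    return math.hypot(x, y)

dist_simultaneous = run(simultaneous_step)
dist_prediction = run(prediction_step)
print("simultaneous updates:", dist_simultaneous)
print("with prediction step:", dist_prediction)
```

On this problem the simultaneous updates spiral outward, since each step multiplies the distance to the origin by sqrt(1 + lr^2), while the lookahead variant spirals inward. A bilinear toy problem says nothing by itself about behavior on real GANs.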
Neural net parameters can often be compressed down to a single bit each without a significant loss in network performance, yielding a huge reduction in memory footprint and computational workload. We develop a theory of quantized nets, and explain the performance of algorithms for weight quantization.
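As a rough illustration of what one-bit storage looks like, the sketch below binarizes a weight vector to a sign pattern plus one shared scale, then packs the signs eight per byte. The scale used here, the mean absolute weight, is one common choice and is an assumption, as are all function names. The summary does not specify a particular quantization algorithm, and this sketch covers storage only, not training.

```python
import random

def binarize(weights):
    """Return a shared scale alpha and a +1/-1 sign for each weight."""
    alpha = sum(abs(w) for w in weights) / len(weights)
    signs = [1 if w >= 0 else -1 for w in weights]
    return alpha, signs

def pack_signs(signs):
    """Pack +1/-1 signs into bytes, eight per byte (a set bit means +1)."""
    packed = bytearray((len(signs) + 7) // 8)
    for i, s in enumerate(signs):
        if s > 0:
            packed[i // 8] |= 1 << (i % 8)
    return bytes(packed)

def unpack_weights(packed, alpha, n):
    """Rebuild the n quantized weights (+alpha or -alpha) from the packed bits."""
    return [alpha if (packed[i // 8] >> (i % 8)) & 1 else -alpha for i in range(n)]

random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(1024)]
alpha, signs = binarize(weights)
packed = pack_signs(signs)
restored = unpack_weights(packed, alpha, len(weights))
print(len(packed), "bytes for", len(weights), "weights")  # prints "128 bytes for 1024 weights"
```

Stored as 32-bit floats, the same 1024 weights would take 4096 bytes, so the packed form is 32 times smaller, plus one float for the scale.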
A number of non-convex optimization problems can be convexified by “lifting” strategies. These methods yield convex formulations at the cost of substantially increased dimensionality. PhaseMax is a new type of convex relaxation that does not require lifting; it solves problems in their original low-dimensional parameter space.
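For concreteness, the contrast can be written out for phase retrieval, the problem PhaseMax was introduced for. The notation below is ours and is a sketch of the formulations as commonly stated in the literature: $x^0 \in \mathbb{C}^n$ is the unknown signal, $a_i$ are known measurement vectors, $b_i$ are the measured magnitudes, and $\hat{x}$ is an initial guess that is assumed to be correlated with $x^0$.

```latex
% Measurements: m magnitudes of linear functionals of the unknown signal
b_i = |\langle a_i, x^0 \rangle|, \qquad i = 1, \dots, m.

% Lifted relaxation: replace x x^H by a matrix X, giving n^2 unknowns
\min_{X \in \mathbb{C}^{n \times n}} \ \operatorname{tr}(X)
\quad \text{subject to} \quad
a_i^H X a_i = b_i^2, \quad X \succeq 0.

% PhaseMax: a convex program in the original n unknowns
\max_{x \in \mathbb{C}^n} \ \operatorname{Re} \langle x, \hat{x} \rangle
\quad \text{subject to} \quad
|\langle a_i, x \rangle| \le b_i, \quad i = 1, \dots, m.
```

The lifted problem optimizes over an $n \times n$ matrix, while the PhaseMax program keeps the $n$-dimensional variable. In the real-valued case each constraint splits into two linear inequalities, so the PhaseMax program becomes a linear program.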