It is well known that certain neural network architectures produce loss functions that are easier to train and that generalize better, but the reasons for this are not well understood. To gain insight, we explore the structure of neural loss functions using a range of visualization methods.
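As a minimal sketch of one such visualization, the code below plots the loss along a single random direction in parameter space, using a simplified per-tensor variant of direction normalization; the `model`, `loss_fn`, `data`, and `targets` names are hypothetical placeholders for a trained PyTorch model and its loss.

```python
import torch

def loss_along_direction(model, loss_fn, data, targets, num_points=41, radius=1.0):
    """Evaluate the loss along one random direction in parameter space."""
    # Save the trained parameters and draw one random direction per tensor.
    center = [p.detach().clone() for p in model.parameters()]
    direction = [torch.randn_like(c) for c in center]
    # Simplified normalization (assumption): scale each direction tensor
    # to match the norm of the corresponding parameter tensor.
    direction = [d * (c.norm() / (d.norm() + 1e-10)) for d, c in zip(direction, center)]

    alphas = torch.linspace(-radius, radius, num_points)
    losses = []
    with torch.no_grad():
        for alpha in alphas:
            # Perturb the weights, evaluate the loss, and record it.
            for p, c, d in zip(model.parameters(), center, direction):
                p.copy_(c + alpha * d)
            losses.append(loss_fn(model(data), targets).item())
        # Restore the original weights.
        for p, c in zip(model.parameters(), center):
            p.copy_(c)
    return alphas, losses
```

Plotting `losses` against `alphas` gives a 1D slice of the landscape; 2D contour plots use two such directions.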
A number of non-convex optimization problems can be convexified by “lifting” strategies. These methods yield convex formulations at the cost of substantially increased dimensionality. PhaseMax is a new type of convex relaxation that does not require lifting; it solves problems in their original low-dimensional parameter space.
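As a concrete instance, for phase retrieval (recovering x from magnitude measurements b_i = |⟨a_i, x⟩|), PhaseMax maximizes alignment with an anchor vector over a convex feasible set in the original space; a sketch of the formulation:

```latex
% PhaseMax sketch for phase retrieval: the b_i are measured magnitudes
% and \hat{x} is an anchor (initial guess) vector.
\begin{aligned}
\underset{x}{\text{maximize}}\quad & \operatorname{Re}\,\langle \hat{x}, x \rangle \\
\text{subject to}\quad & |\langle a_i, x \rangle| \le b_i, \qquad i = 1,\dots,m
\end{aligned}
```

The objective is linear and each constraint is convex, so the problem stays in the original n-dimensional space rather than being lifted to an n-by-n matrix variable.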
FASTA (Fast Adaptive Shrinkage/Thresholding Algorithm) is an efficient, easy-to-use implementation of the Forward-Backward Splitting (FBS) method (also known as the proximal gradient method) for regularized optimization problems. Many variations on FBS are available in FASTA, including the popular accelerated variant FISTA (Beck and Teboulle ’09) and the adaptive stepsize rule SpaRSA.
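As an illustration of the underlying FBS iteration (not FASTA’s own interface), here is a minimal NumPy sketch applied to the lasso problem, with a fixed step size chosen from the spectral norm of A as an illustrative assumption:

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of t * ||x||_1 (shrinkage/thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def fbs_lasso(A, b, lam, num_iters=500):
    """Plain FBS for min 0.5*||Ax - b||^2 + lam*||x||_1."""
    x = np.zeros(A.shape[1])
    tau = 1.0 / np.linalg.norm(A, 2) ** 2              # step <= 1/L, L = ||A||_2^2
    for _ in range(num_iters):
        grad = A.T @ (A @ x - b)                       # forward (gradient) step
        x = soft_threshold(x - tau * grad, tau * lam)  # backward (proximal) step
    return x
```

FISTA adds momentum to this loop, and SpaRSA replaces the fixed `tau` with an adaptive choice.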
PDHG (Primal-Dual Hybrid Gradient) is a powerful splitting method that can solve a wide range of constrained and non-differentiable optimization problems. Unlike the popular ADMM method, the PDHG approach usually does not require expensive minimization sub-steps. We provide adaptive stepsize selection rules that automate the solver while increasing its speed and robustness.
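A minimal sketch of the basic PDHG iteration (with fixed, non-adaptive steps for clarity), applied to the same lasso problem used above; the problem setup is an illustrative assumption:

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of t * ||x||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def pdhg_lasso(A, b, lam, num_iters=500):
    """Basic PDHG for min_x lam*||x||_1 + 0.5*||Ax - b||^2."""
    m, n = A.shape
    x, y = np.zeros(n), np.zeros(m)
    L = np.linalg.norm(A, 2)        # operator norm of A
    tau = sigma = 0.9 / L           # fixed steps with sigma*tau*L^2 < 1
    x_bar = x.copy()
    for _ in range(num_iters):
        # Dual step: prox of the convex conjugate of 0.5*||. - b||^2.
        y = (y + sigma * (A @ x_bar - b)) / (1.0 + sigma)
        # Primal step: shrinkage, the prox of lam*||.||_1.
        x_new = soft_threshold(x - tau * (A.T @ y), tau * lam)
        # Extrapolation step.
        x_bar = 2.0 * x_new - x
        x = x_new
    return x
```

Each iteration uses only matrix-vector products and closed-form proximal operators, with no inner minimization sub-steps; the adaptive rules mentioned above adjust `tau` and `sigma` on the fly instead of fixing them.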