Training Quantized Nets: A Deeper Understanding

Why quantized nets?

Deep neural networks are an integral part of state-of-the-art computer vision and natural language processing systems. Because of their high memory requirements and computational complexity, networks are usually trained on powerful hardware. There is increasing interest in training and deploying neural networks directly on battery-powered devices, such as cell phones and other embedded platforms. Such low-power systems are memory- and power-limited, and in some cases lack basic support for floating-point arithmetic.

To make neural nets practical on embedded systems, many researchers have focused on training nets with coarsely quantized weights. For example, weights may be constrained to take on integer/binary values, or may be represented using low-precision fixed-point numbers. In fact, for many applications, weights can be binarized – represented using just 1 bit – without a significant loss in network performance.
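As a rough illustration of what such quantization looks like in practice, the sketch below binarizes a weight matrix with the sign function and rounds it onto a fixed-point grid. The function names and bit widths here are illustrative choices, not the specific schemes analyzed in the article.

```python
import numpy as np

def binarize(weights):
    """Map full-precision weights to {-1, +1} using the sign function."""
    return np.where(weights >= 0, 1.0, -1.0)

def fixed_point_quantize(weights, num_bits=8, frac_bits=6):
    """Round weights to a low-precision fixed-point grid and clip them to
    the representable range (bit widths are illustrative)."""
    scale = 2.0 ** frac_bits
    max_val = (2.0 ** (num_bits - 1) - 1) / scale
    min_val = -(2.0 ** (num_bits - 1)) / scale
    return np.clip(np.round(weights * scale) / scale, min_val, max_val)

# Example: quantize a small random weight matrix both ways.
w = 0.5 * np.random.randn(3, 3)
print(binarize(w))
print(fixed_point_quantize(w))
```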

Unfortunately, training methods for binarized networks still require high-precision computations (even though the end result is a low-precision net), which makes it difficult to exploit low-precision arithmetic to speed up training.
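To make this concrete, here is a minimal sketch of the pattern commonly used to train binarized nets (in the style of BinaryConnect): a full-precision master copy of the weights is kept, the quantized weights are used to evaluate gradients, and the small gradient updates are accumulated into the high-precision copy. All names and the toy objective are illustrative, not the article's exact algorithm.

```python
import numpy as np

def binarize(w):
    # Quantized weights in {-1, +1} used for the forward/backward pass.
    return np.where(w >= 0, 1.0, -1.0)

def training_step(w_real, gradient_fn, lr=0.05):
    """One update of a binarized net that still relies on high precision.

    w_real is the full-precision master copy of the weights; gradient_fn
    returns the loss gradient evaluated at a given weight setting.
    """
    w_bin = binarize(w_real)           # low-precision weights used for compute
    grad = gradient_fn(w_bin)          # gradient evaluated at the quantized weights
    w_real = w_real - lr * grad        # update accumulated at full precision
    return np.clip(w_real, -1.0, 1.0)  # keep the master copy bounded

# Toy usage: fit binarized weights to a least-squares target.
target = np.array([0.7, -0.2, 0.9])
grad_fn = lambda w: 2.0 * (w - target)
w = np.zeros(3)
for _ in range(100):
    w = training_step(w, grad_fn)
print(binarize(w))  # low-precision weights produced by a high-precision process
```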

A theory for quantized nets

Numerous recent publications have studied methods for training quantized networks, but these studies have been mostly empirical. Practitioners generally rely on a collection of tricks, hacks, and specialized training methods that sometimes perform well and sometimes perform poorly. Furthermore, it is unclear why full precision is needed to train quantized and binarized nets.

In this work, we investigate training methods for quantized neural networks from a theoretical viewpoint. We first establish accuracy guarantees for training methods under convexity assumptions. We then examine the behavior of these algorithms on non-convex problems and show that training algorithms that exploit high-precision representations have an important greedy search phase that purely quantized training methods lack. This explains the difficulty of training using only low-precision arithmetic.
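As a schematic contrast, the sketch below places the two kinds of updates side by side: a purely quantized update based on stochastic rounding, where the weights never leave the low-precision grid, and a high-precision-assisted update that accumulates small gradient steps in a real-valued copy. This is only meant to illustrate the distinction drawn by the analysis; the grid spacing and learning rate are illustrative assumptions.

```python
import numpy as np

def stochastic_round(x, step):
    """Round each entry of x to an adjacent grid point at random, with
    probability proportional to proximity (unbiased rounding)."""
    low = np.floor(x / step) * step
    prob_up = (x - low) / step
    return low + step * (np.random.rand(*np.shape(x)) < prob_up)

def purely_quantized_step(w_q, grad, lr=0.01, step=2.0 ** -4):
    # The weights themselves live on the grid: a gradient step smaller than
    # the grid spacing survives only through the randomness of the rounding,
    # so the iterates tend to hover around minimizers rather than settle.
    return stochastic_round(w_q - lr * grad, step=step)

def high_precision_step(w_real, grad, lr=0.01):
    # A real-valued copy accumulates arbitrarily small gradient steps
    # deterministically; quantization is applied only when the weights are
    # used, which is what supplies the greedy search phase described above.
    return w_real - lr * grad
```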

A complete overview of quantized nets, and the theory of training them, can be found in our article below.

Training Quantized Nets: A Deeper Understanding.