Quantization Error as a Metric for Dynamic Precision Scaling in Neural Net Training
This work addresses the computational efficiency of neural network training, but it is incremental: it builds on prior reduced-precision methods.
The paper aims to reduce the computational cost of neural network training by introducing a dynamic precision scaling scheme that uses quantization error as its metric. It reports 98.8% test accuracy on MNIST with average bit-widths of 16 bits for weights and 14 bits for activations.
Recent work has explored reduced numerical precision for parameters, activations, and gradients during neural network training as a way to reduce the computational cost of training (Na & Mukhopadhyay, 2016; Courbariaux et al., 2014). We present a novel dynamic precision scaling (DPS) scheme. Using stochastic fixed-point rounding, a quantization-error-based scaling scheme, and dynamic bit-widths during training, we achieve 98.8% test accuracy on the MNIST dataset using an average bit-width of just 16 bits for weights and 14 bits for activations, compared to the standard 32-bit floating-point values used in deep learning frameworks.
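To make the terms in the abstract concrete, the following is a minimal sketch of stochastic rounding to a signed fixed-point format, together with one possible way to measure the resulting quantization error. It is an illustration under stated assumptions, not the scheme used in this work: the function names, the convention that the integer bit count includes the sign bit, the saturation behaviour, and the choice of mean absolute error as the metric are all hypothetical.

```python
import numpy as np


def stochastic_round_fixed_point(x, int_bits, frac_bits, rng):
    """Quantize x to signed fixed point using stochastic rounding.

    Assumption: int_bits includes the sign bit, so the representable
    range is [-2**(int_bits - 1), 2**(int_bits - 1) - 2**-frac_bits].
    """
    scale = 2.0 ** frac_bits
    scaled = np.asarray(x, dtype=np.float64) * scale
    lower = np.floor(scaled)
    # Round up with probability equal to the fractional remainder,
    # which makes the rounding unbiased in expectation.
    round_up = rng.random(scaled.shape) < (scaled - lower)
    q = (lower + round_up) / scale
    # Saturate values that fall outside the representable range.
    max_val = 2.0 ** (int_bits - 1) - 1.0 / scale
    min_val = -(2.0 ** (int_bits - 1))
    return np.clip(q, min_val, max_val)


def mean_abs_quantization_error(x, q):
    """Hypothetical error metric: mean absolute difference between x and q."""
    return float(np.mean(np.abs(np.asarray(x, dtype=np.float64) - q)))


rng = np.random.default_rng(0)
weights = rng.normal(scale=0.5, size=1000)
quantized = stochastic_round_fixed_point(weights, int_bits=4, frac_bits=12, rng=rng)
error = mean_abs_quantization_error(weights, quantized)
```

A dynamic scheme could, for example, compare such an error value against a threshold to decide whether to add or remove fractional bits during training; that policy is left out here because the abstract does not specify it.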