LG · ML · Mar 9, 2018

High-Accuracy Low-Precision Training

arXiv:1803.03383v1 · 114 citations
Originality: Highly original
AI Analysis

HALP addresses the need for efficient training on hardware accelerators and offers a practical improvement over existing low-precision methods.

The paper tackles low-precision training in machine learning, where earlier algorithms lost accuracy to quantization noise. It introduces HALP, a low-precision stochastic gradient descent variant that matches the theoretical convergence rate of full-precision algorithms and runs up to 4x faster than full-precision SVRG on the CPU.

Low-precision computation is often used to lower the time and energy cost of machine learning, and recently hardware accelerators have been developed to support it. Still, it has been used primarily for inference - not training. Previous low-precision training algorithms suffered from a fundamental tradeoff: as the number of bits of precision is lowered, quantization noise is added to the model, which limits statistical accuracy. To address this issue, we describe a simple low-precision stochastic gradient descent variant called HALP. HALP converges at the same theoretical rate as full-precision algorithms despite the noise introduced by using low precision throughout execution. The key idea is to use SVRG to reduce gradient variance, and to combine this with a novel technique called bit centering to reduce quantization error. We show that on the CPU, HALP can run up to $4 \times$ faster than full-precision SVRG and can match its convergence trajectory. We implemented HALP in TensorQuant, and show that it exceeds the validation performance of plain low-precision SGD on two deep learning tasks.
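The abstract names two ingredients, SVRG and bit centering, without spelling out the update rule. The following is a minimal sketch of how they might fit together on a least-squares problem, not the authors' implementation. It assumes that bit centering means storing only a low-precision offset `z` from a full-precision anchor point, and shrinking the quantization grid in proportion to the full-gradient norm at each outer step. The names `quantize` and `halp_sketch`, the strong-convexity estimate `mu`, the grid-scale rule, and all hyperparameter defaults are illustrative assumptions.

```python
import numpy as np


def quantize(x, scale, bits, rng):
    """Stochastically round x onto the grid {k * scale}, clipped to a signed b-bit range."""
    qmax = 2 ** (bits - 1) - 1
    y = x / scale
    low = np.floor(y)
    # Round up with probability equal to the fractional part (unbiased rounding).
    y = low + (rng.random(x.shape) < (y - low))
    return np.clip(y, -qmax, qmax) * scale


def halp_sketch(A, b, bits=8, alpha=0.02, mu=0.5, epochs=20, inner=500, seed=0):
    """SVRG on f(w) = mean_i 0.5 * (a_i . w - b_i)^2 with a re-centered low-precision offset.

    Assumption: the anchor and the full gradient are kept in full precision,
    while the inner-loop iterate is the quantized offset z.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    w_anchor = np.zeros(d)
    for _ in range(epochs):
        # Outer step (SVRG): full gradient at the anchor, in full precision.
        g_full = A.T @ (A @ w_anchor - b) / n
        g_norm = np.linalg.norm(g_full)
        if g_norm == 0.0:
            break
        # Assumed bit-centering rule: the representable box has radius g_norm / mu,
        # so the grid gets finer as the anchor approaches the optimum.
        scale = g_norm / (mu * (2 ** (bits - 1) - 1))
        z = np.zeros(d)
        for _ in range(inner):
            a = A[rng.integers(n)]
            # Variance-reduced gradient: grad_i(w_anchor + z) - grad_i(w_anchor) + g_full.
            # For least squares the difference of the two sample gradients is a * (a . z).
            v = a * (a @ z) + g_full
            z = quantize(z - alpha * v, scale, bits, rng)
        # Re-center: fold the offset into the anchor and start the next epoch from z = 0.
        w_anchor = w_anchor + z
    return w_anchor


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.standard_normal((200, 5))
    w_true = rng.standard_normal(5)
    w = halp_sketch(A, A @ w_true)
    print("relative error:", np.linalg.norm(w - w_true) / np.linalg.norm(w_true))
```

In this sketch the quantization error is tied to the grid spacing, and the grid spacing is tied to the gradient norm at the anchor, so the error shrinks together with the distance to the optimum. With a fixed grid, as in plain low-precision SGD, the rounding noise would instead set a floor on the achievable accuracy.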

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
