LG · AI · CV · Mar 31, 2021

Bit-Mixer: Mixed-precision networks with runtime bit-width selection

arXiv:2103.17267v1 · 31 citations
Originality: Highly original
AI Analysis

The paper addresses the need for adaptable quantization in neural networks for dynamic on-device deployment, and it represents a novel approach rather than an incremental improvement.

The paper tackles the problem of mixed-precision networks requiring predefined bit-widths during training, which limits runtime flexibility for on-device deployment, and proposes Bit-Mixer, a method that enables runtime bit-width selection without compromising accuracy.

Mixed-precision networks allow variable bit-width quantization for every layer in the network. A major limitation of existing work is that the bit-width for each layer must be predefined at training time. This allows little flexibility if the characteristics of the device on which the network is deployed change at runtime. In this work, we propose Bit-Mixer, the very first method to train a meta-quantized network in which, at test time, any layer can change its bit-width without affecting the overall network's ability to perform highly accurate inference. To this end, we make 2 key contributions: (a) Transitional Batch-Norms, and (b) a 3-stage optimization process which is shown capable of training such a network. We show that our method can result in mixed-precision networks that exhibit the desirable flexibility properties for on-device deployment without compromising accuracy. Code will be made available.
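To make the idea of runtime bit-width selection concrete, here is a minimal sketch in plain Python. It is not the Bit-Mixer method: the symmetric uniform quantizer, the `RuntimeQuantLinear` class, and the `run_network` helper are all hypothetical names and assumptions introduced for illustration, and the sketch leaves out Transitional Batch-Norms and the 3-stage training process entirely. It only shows what it means for each layer to receive its bit-width at inference time instead of having it fixed during training.

```python
# Illustrative sketch only. The symmetric uniform quantizer below is an
# assumption for demonstration, not the quantizer used in the paper.

def quantize(values, bits):
    """Snap a list of floats onto a signed uniform grid with `bits` bits."""
    if bits < 2:
        raise ValueError("bits must be at least 2 for a signed grid")
    qmax = 2 ** (bits - 1) - 1            # largest positive level, e.g. 7 for 4 bits
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return list(values)               # all zeros: nothing to quantize
    scale = max_abs / qmax                # step size between adjacent levels
    return [round(v / scale) * scale for v in values]


class RuntimeQuantLinear:
    """Hypothetical linear layer whose weight bit-width is chosen per call."""

    def __init__(self, weights, bias):
        self.weights = weights            # full-precision rows, kept as stored
        self.bias = bias

    def forward(self, x, bits):
        out = []
        for row, b in zip(self.weights, self.bias):
            q_row = quantize(row, bits)   # quantize at the requested precision
            out.append(sum(w * xi for w, xi in zip(q_row, x)) + b)
        return out


def run_network(layers, x, bit_widths):
    """Run the layers in order, each with its own runtime bit-width."""
    if len(layers) != len(bit_widths):
        raise ValueError("need exactly one bit-width per layer")
    for layer, bits in zip(layers, bit_widths):
        x = layer.forward(x, bits)
    return x
```

For example, `run_network(layers, x, [4, 2, 3])` and `run_network(layers, x, [2, 2, 4])` reuse the same stored weights with different per-layer precisions. Without a training procedure designed for this, such as the one the abstract proposes, accuracy would generally degrade when bit-widths are switched this way.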

