LG · CV · ML · Apr 22, 2020

Up or Down? Adaptive Rounding for Post-Training Quantization

arXiv:2004.10568v2 · 846 citations
Originality: Highly original
AI Analysis

AdaRound addresses the challenge of deploying neural networks efficiently: it gives practitioners a fast, data-adaptive quantization method that significantly improves accuracy over standard rounding-to-nearest. It is incremental in that it builds on existing post-training quantization frameworks.

The paper tackles weight rounding in post-training quantization of neural networks. It proposes AdaRound, an adaptive rounding mechanism that uses a small amount of unlabelled data and the task loss to improve accuracy without fine-tuning. The weights of ResNet18 and ResNet50 can be quantized to 4 bits with an accuracy loss of less than 1%.

When quantizing neural networks, assigning each floating-point weight to its nearest fixed-point value is the predominant approach. We find that, perhaps surprisingly, this is not the best we can do. In this paper, we propose AdaRound, a better weight-rounding mechanism for post-training quantization that adapts to the data and the task loss. AdaRound is fast, does not require fine-tuning of the network, and only uses a small amount of unlabelled data. We start by theoretically analyzing the rounding problem for a pre-trained neural network. By approximating the task loss with a Taylor series expansion, the rounding task is posed as a quadratic unconstrained binary optimization problem. We simplify this to a layer-wise local loss and propose to optimize this loss with a soft relaxation. AdaRound not only outperforms rounding-to-nearest by a significant margin but also establishes a new state-of-the-art for post-training quantization on several networks and tasks. Without fine-tuning, we can quantize the weights of Resnet18 and Resnet50 to 4 bits while staying within an accuracy loss of 1%.
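The claim that rounding-to-nearest is not the best choice can be illustrated on a toy layer. The sketch below is not the AdaRound algorithm: it replaces the paper's soft relaxation with an exhaustive search over every up/down choice, which is only feasible for a handful of weights. The function names, the symmetric 4-bit grid, the scale choice, and the random data are all assumptions made for illustration.

```python
import itertools

import numpy as np


def quantize_nearest(w, scale, qmin=-8, qmax=7):
    """Round each weight to the nearest point of a signed 4-bit grid."""
    return scale * np.clip(np.round(w / scale), qmin, qmax)


def quantize_best_rounding(w, x, scale, qmin=-8, qmax=7):
    """Try every floor/ceil assignment and keep the one that minimises
    the layer-wise output error ||x @ w - x @ w_q||^2.

    Brute force stands in for the soft relaxation described in the
    abstract; it only scales to very small weight vectors.
    """
    base = np.floor(w / scale)       # "round down" grid index per weight
    target = x @ w                   # full-precision layer output
    best_loss, best_wq = np.inf, None
    for bits in itertools.product((0.0, 1.0), repeat=w.size):
        # bits[i] == 1 means "round weight i up", 0 means "round down"
        w_q = scale * np.clip(base + np.array(bits), qmin, qmax)
        loss = float(np.sum((target - x @ w_q) ** 2))
        if loss < best_loss:
            best_loss, best_wq = loss, w_q
    return best_wq, best_loss


rng = np.random.default_rng(0)
w = rng.normal(size=8)               # toy weights (assumed, not from the paper)
x = rng.normal(size=(64, 8))         # small unlabelled calibration batch (assumed)
scale = np.abs(w).max() / 7          # simple symmetric scale (assumed)

w_nearest = quantize_nearest(w, scale)
loss_nearest = float(np.sum((x @ w - x @ w_nearest) ** 2))
w_best, loss_best = quantize_best_rounding(w, x, scale)

print(f"rounding-to-nearest loss: {loss_nearest:.6f}")
print(f"best up/down loss:        {loss_best:.6f}")
```

Rounding-to-nearest is itself one of the up/down assignments, so the searched loss can never be worse than the nearest-rounding loss. How much it improves depends on the data, because the layer-wise loss couples the weights through the inputs `x`.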
