CV · Oct 22, 2025

Adaptive Distribution-aware Quantization for Mixed-Precision Neural Networks

arXiv:2510.19760v1
h-index: 5
Originality: Highly original
AI Analysis

This work addresses efficient deployment of neural networks on resource-constrained devices. It offers a novel quantization method with significant reported performance gains, though it builds incrementally on existing quantization techniques.

The paper tackles two challenges in quantization-aware training, non-uniform activation distributions and static weight codebooks, by proposing Adaptive Distribution-aware Quantization (ADQ), a mixed-precision framework that reports 71.512% Top-1 accuracy on ImageNet with ResNet-18 at an average bit-width of 2.81 bits, outperforming state-of-the-art methods under comparable conditions.

Quantization-Aware Training (QAT) is a critical technique for deploying deep neural networks on resource-constrained devices. However, existing methods often face two major challenges: the highly non-uniform distribution of activations and the static, mismatched codebooks used in weight quantization. To address these challenges, we propose Adaptive Distribution-aware Quantization (ADQ), a mixed-precision quantization framework that employs a differentiated strategy. The core of ADQ is a novel adaptive weight quantization scheme comprising three key innovations: (1) a quantile-based initialization method that constructs a codebook closely aligned with the initial weight distribution; (2) an online codebook adaptation mechanism based on Exponential Moving Average (EMA) to dynamically track distributional shifts; and (3) a sensitivity-informed strategy for mixed-precision allocation. For activations, we integrate a hardware-friendly non-uniform-to-uniform mapping scheme. Comprehensive experiments validate the effectiveness of our method. On ImageNet, ADQ enables a ResNet-18 to achieve 71.512% Top-1 accuracy with an average bit-width of only 2.81 bits, outperforming state-of-the-art methods under comparable conditions. Furthermore, detailed ablation studies on CIFAR-10 systematically demonstrate the individual contributions of each innovative component, validating the rationale and effectiveness of our design.
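
The first two weight-quantization ideas in the abstract, quantile-based codebook initialization and EMA-based codebook adaptation, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no formulas, so the midpoint-quantile placement, the nearest-level assignment, the use of per-level means as the EMA target, the `decay` value, and all function names (`quantile_codebook`, `assign`, `ema_update`, `quantize`) are assumptions made for illustration.

```python
import numpy as np


def quantile_codebook(weights, bits):
    """Place 2**bits levels at evenly spaced quantiles of the weights (assumed scheme)."""
    k = 2 ** bits
    # Midpoint quantiles (0.5/k, 1.5/k, ...) so each level stands for an equal-mass bin.
    probs = (np.arange(k) + 0.5) / k
    return np.quantile(weights.ravel(), probs)


def assign(weights, codebook):
    """Return, for every weight, the index of its nearest codebook level."""
    dist = np.abs(weights.reshape(-1, 1) - codebook.reshape(1, -1))
    return dist.argmin(axis=1)


def ema_update(codebook, weights, decay=0.9):
    """Move each level toward the mean of the weights currently assigned to it.

    Assumed update rule: new = decay * old + (1 - decay) * cluster_mean.
    Levels with no assigned weights are left unchanged.
    """
    idx = assign(weights, codebook)
    flat = weights.ravel()
    updated = codebook.copy()
    for j in range(codebook.size):
        members = flat[idx == j]
        if members.size:
            updated[j] = decay * codebook[j] + (1.0 - decay) * members.mean()
    return updated


def quantize(weights, codebook):
    """Replace every weight with its nearest codebook level."""
    return codebook[assign(weights, codebook)].reshape(weights.shape)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=(64, 32))          # stand-in for one layer's weights
    cb = quantile_codebook(w, bits=3)      # 8 levels matched to the initial distribution
    w_shifted = w * 1.2 + 0.05             # pretend training shifted the distribution
    for _ in range(20):                    # codebook tracks the shift over several steps
        cb = ema_update(cb, w_shifted)
    err = np.mean((w_shifted - quantize(w_shifted, cb)) ** 2)
    print(f"quantization MSE after adaptation: {err:.5f}")
```

In a real quantization-aware training loop the update would run alongside gradient steps, with gradients passed through the non-differentiable assignment (for example with a straight-through estimator). The sketch omits that, along with the sensitivity-informed bit allocation and the activation mapping described in the abstract.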

