CV, LG, NE
Feb 11, 2024

Outlier-Aware Training for Low-Bit Quantization of Structural Re-Parameterized Networks

arXiv:2402.07200v1
h-index: 4
Originality: Incremental advance
AI Analysis

This work addresses the challenge of deploying efficient lightweight CNNs on resource-constrained devices by improving quantization for a specific network paradigm, representing an incremental advancement in model compression techniques.

The paper tackles the problem of quantizing structural re-parameterized networks, which suffer from outlier weights that hinder low-bit quantization, by proposing Outlier Aware Batch Normalization and a clustering-based quantization framework. The result is a significant enhancement in quantized performance for RepVGG, especially at bitwidths below 8.
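To make the idea of clustering-based non-uniform quantization concrete, here is a minimal sketch using plain 1-D k-means on a weight vector that contains a few large outliers. It is a generic illustration only, not the paper's ClusterQAT: the function names (`kmeans_quantize`, `uniform_quantize`), the quantile initialisation, the synthetic weights, and the 4-bit setting are all assumptions made for this example.

```python
import numpy as np

def kmeans_quantize(weights, bits, iters=20):
    """Map weights onto 2**bits shared values found by 1-D k-means (Lloyd's algorithm)."""
    flat = weights.ravel()
    k = 2 ** bits
    # Assumption: initialise centroids at evenly spaced quantiles of the weights.
    centroids = np.quantile(flat, np.linspace(0.0, 1.0, k))
    for _ in range(iters):
        # Assign every weight to its nearest centroid.
        idx = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
        # Move each non-empty centroid to the mean of its members.
        for j in range(k):
            members = flat[idx == j]
            if members.size:
                centroids[j] = members.mean()
    idx = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
    return centroids[idx].reshape(weights.shape), centroids

def uniform_quantize(weights, bits):
    """Baseline: 2**bits evenly spaced levels spanning [min, max]."""
    lo, hi = weights.min(), weights.max()
    step = (hi - lo) / (2 ** bits - 1)
    return lo + np.round((weights - lo) / step) * step

# Synthetic weights: a narrow bulk plus a handful of large outliers (hypothetical data).
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.05, size=4096)
w[:8] = 2.0

q_k, centroids = kmeans_quantize(w, bits=4)
q_u = uniform_quantize(w, bits=4)
mse_k = float(np.mean((w - q_k) ** 2))
mse_u = float(np.mean((w - q_u) ** 2))
print("k-means MSE:", mse_k)
print("uniform MSE:", mse_u)
```

In this toy setting the outliers stretch the uniform grid across a range that almost no weight occupies, while the clustered levels concentrate where the weights actually lie. How the paper integrates clustering into quantization-aware training is not described in this summary.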

Lightweight design of Convolutional Neural Networks (CNNs) requires co-design of model architectures and compression techniques. As a novel design paradigm that separates training from inference, a structural re-parameterized (SR) network such as the representative RepVGG revitalizes the simple VGG-like network, reaching accuracy comparable to advanced and often more complicated networks. However, the merging process in SR networks introduces outliers into the weights, making their distribution distinct from that of conventional networks and thus making quantization more difficult. To address this, we propose an operator-level improvement for training called Outlier Aware Batch Normalization (OABN). Additionally, to meet the demands of limited bitwidths while preserving inference accuracy, we develop a clustering-based non-uniform quantization framework for Quantization-Aware Training (QAT) named ClusterQAT. Integrating OABN with ClusterQAT substantially improves the quantized performance of RepVGG, particularly when the bitwidth falls below 8.
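The merging process mentioned above can be sketched for a RepVGG-style block, where a 3x3 branch, a 1x1 branch, and an identity branch (each followed by batch normalization) are folded into a single 3x3 convolution for inference. The NumPy sketch below follows the standard BN-folding arithmetic; the helper names (`conv2d`, `batchnorm`, `fuse_bn`, `merge_branches`), the channel count, and the random parameters are assumptions for illustration, and it does not implement OABN.

```python
import numpy as np

def conv2d(x, kernel, bias=None):
    """Stride-1 'same' convolution. x: (C_in, H, W); kernel: (C_out, C_in, kh, kw)."""
    kh, kw = kernel.shape[2:]
    xp = np.pad(x, ((0, 0), (kh // 2, kh // 2), (kw // 2, kw // 2)))
    win = np.lib.stride_tricks.sliding_window_view(xp, (kh, kw), axis=(1, 2))
    out = np.einsum("chwij,ocij->ohw", win, kernel)
    return out if bias is None else out + bias[:, None, None]

def batchnorm(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-mode batch normalization with fixed per-channel statistics."""
    scale = gamma / np.sqrt(var + eps)
    return (y - mean[:, None, None]) * scale[:, None, None] + beta[:, None, None]

def fuse_bn(kernel, gamma, beta, mean, var, eps=1e-5):
    """Fold a BN layer into the preceding convolution's kernel and bias."""
    scale = gamma / np.sqrt(var + eps)
    return kernel * scale[:, None, None, None], beta - mean * scale

def merge_branches(k3, bn3, k1, bn1, bn_id):
    """Merge 3x3, 1x1 and identity branches into one 3x3 kernel and bias."""
    c = k3.shape[0]
    k3f, b3f = fuse_bn(k3, *bn3)
    # Zero-pad the 1x1 kernel so it sits at the centre of a 3x3 kernel.
    k1f, b1f = fuse_bn(np.pad(k1, ((0, 0), (0, 0), (1, 1), (1, 1))), *bn1)
    # The identity branch is a 3x3 kernel with a single 1 per channel.
    k_id = np.zeros_like(k3)
    k_id[np.arange(c), np.arange(c), 1, 1] = 1.0
    kif, bif = fuse_bn(k_id, *bn_id)
    return k3f + k1f + kif, b3f + b1f + bif

rng = np.random.default_rng(0)
C = 4  # hypothetical channel count (input and output channels must match)

def random_bn():
    # Returns (gamma, beta, running_mean, running_var) with positive variance.
    return (rng.uniform(0.5, 1.5, C), rng.normal(0, 0.1, C),
            rng.normal(0, 0.1, C), rng.uniform(0.5, 1.5, C))

k3 = rng.normal(0, 0.1, (C, C, 3, 3))
k1 = rng.normal(0, 0.1, (C, C, 1, 1))
bn3, bn1, bn_id = random_bn(), random_bn(), random_bn()
x = rng.normal(size=(C, 8, 8))

# Training-time structure: three parallel branches summed together.
y_branches = (batchnorm(conv2d(x, k3), *bn3)
              + batchnorm(conv2d(x, k1), *bn1)
              + batchnorm(x, *bn_id))
# Inference-time structure: one merged 3x3 convolution.
k_merged, b_merged = merge_branches(k3, bn3, k1, bn1, bn_id)
y_merged = conv2d(x, k_merged, b_merged)
print("max abs difference:", float(np.abs(y_branches - y_merged).max()))
```

Note that the 1x1 and identity branches contribute only to the centre tap of the merged kernel, so those entries can sit far from the rest of the weight distribution. Whether this is the exact outlier mechanism the paper analyses is an assumption here.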

