CV | Mar 24, 2023

Hard Sample Matters a Lot in Zero-Shot Quantization

arXiv:2303.13826v1 | 30 citations | h-index: 58
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of compressing deep neural networks efficiently when the original training data is inaccessible. The advance is incremental, as it builds on existing ZSQ approaches.

The paper tackles zero-shot quantization (ZSQ), in which synthetic samples stand in for the original training data during network compression. It finds that models quantized with existing methods degrade on hard samples. The proposed HAST method significantly outperforms these methods and achieves performance comparable to models quantized with real data.

Zero-shot quantization (ZSQ) is promising for compressing and accelerating deep neural networks when the data used to train the full-precision models are inaccessible. In ZSQ, network quantization is performed using synthetic samples, so the performance of quantized models depends heavily on the quality of those samples. Nonetheless, we find that the synthetic samples constructed by existing ZSQ methods can be easily fitted by models. Accordingly, quantized models obtained by these methods suffer significant performance degradation on hard samples. To address this issue, we propose HArd sample Synthesizing and Training (HAST). Specifically, HAST pays more attention to hard samples when synthesizing samples and makes synthetic samples hard to fit when training quantized models. HAST also aligns the features extracted by the full-precision and quantized models to ensure that the two stay similar. Extensive experiments show that HAST significantly outperforms existing ZSQ methods, achieving performance comparable to models that are quantized with real data.
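
The abstract names two ingredients, emphasizing hard samples and aligning features between the full-precision and quantized models, without giving their formulas. The sketch below is only an illustration of those two ideas under stated assumptions, not HAST's actual losses: the difficulty score `1 - p(label)`, the focal-style exponent `gamma`, the use of mean squared error for alignment, and all function names are hypothetical choices made here.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def difficulty(logits, label):
    """Hypothetical difficulty score: 1 - p(label).

    Close to 0 means the sample is easily fitted; close to 1 means it is hard.
    """
    return 1.0 - softmax(logits)[label]


def hard_weighted_ce(batch_logits, labels, gamma=2.0):
    """Cross-entropy with each sample weighted by difficulty ** gamma.

    Assumption: a focal-style weighting, used here only to illustrate
    "paying more attention to hard samples".
    """
    total = 0.0
    for logits, y in zip(batch_logits, labels):
        p = softmax(logits)[y]
        total += (1.0 - p) ** gamma * -math.log(p)
    return total / len(labels)


def feature_alignment(fp_feats, q_feats):
    """Mean squared error between full-precision and quantized features.

    Assumption: MSE is one simple way to express feature similarity.
    """
    sq_sum, count = 0.0, 0
    for f_vec, q_vec in zip(fp_feats, q_feats):
        for a, b in zip(f_vec, q_vec):
            sq_sum += (a - b) ** 2
            count += 1
    return sq_sum / count
```

With this weighting, a sample the model already classifies confidently contributes almost nothing to the loss, while a sample near the decision boundary dominates it. The alignment term is zero only when the quantized model reproduces the full-precision features exactly.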

Code Implementations: 1 repo
