LGAICVJul 29, 2024

MimiQ: Low-Bit Data-Free Quantization of Vision Transformers with Encouraging Inter-Head Attention Similarity

arXiv:2407.20021v49 citationsh-index: 4
Originality Incremental advance
AI Analysis

This work addresses the challenge of compressing vision transformers without original data, which is incremental as it builds on existing data-free quantization methods.

The paper tackles the problem of low-bit data-free quantization for vision transformers, which often fails due to misaligned attention maps in synthetic data, and proposes MimiQ to enhance inter-head attention similarity, achieving state-of-the-art performance.

Data-free quantization (DFQ) is a technique that creates a lightweight network from its full-precision counterpart without the original training data, often through a synthetic dataset. Although several DFQ methods have been proposed for vision transformer (ViT) architectures, they fail to achieve efficacy in low-bit settings. Examining the existing methods, we observe that their synthetic data produce misaligned attention maps, while those of the real samples are highly aligned. From this observation, we find that aligning attention maps of synthetic data helps improve the overall performance of quantized ViTs. Motivated by this finding, we devise MimiQ, a novel DFQ method designed for ViTs that enhances inter-head attention similarity. First, we generate synthetic data by aligning head-wise attention outputs from each spatial query patch. Then, we align the attention maps of the quantized network to those of the full-precision teacher by applying head-wise structural attention distillation. The experimental results show that the proposed method significantly outperforms baselines, setting a new state-of-the-art for ViT-DFQ. This paper is an extended version of our work published in the proceedings of AAAI 2025, including additional supplementary material.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes