CV | AI | Jun 11, 2025

Q-SAM2: Accurate Quantization for Segment Anything Model 2

IBM
arXiv:2506.09782v1 | 1 citation | h-index: 21
Originality: Incremental advance
AI Analysis

This work addresses the problem of deploying SAM2 in resource-constrained scenarios, offering an incremental improvement over existing quantization methods.

The paper tackles the high computational and memory costs of Segment Anything Model 2 (SAM2) by proposing Q-SAM2, a low-bit quantization method that improves efficiency while maintaining accuracy. Its calibration technique also carries over to post-training quantization, where it yields up to a 66% mIoU improvement over non-calibrated models.

The Segment Anything Model 2 (SAM2) has gained significant attention as a foundational approach for promptable image and video segmentation. However, its expensive computational and memory consumption poses a severe challenge for its application in resource-constrained scenarios. In this paper, we propose an accurate low-bit quantization method for efficient SAM2, termed Q-SAM2. To address the performance degradation caused by the singularities in weight and activation distributions during quantization, Q-SAM2 introduces two novel technical contributions. We first introduce a linear layer calibration method for low-bit initialization of SAM2, which minimizes the Frobenius norm over a small image batch to reposition weight distributions for improved quantization. We then propose a Quantization-Aware Training (QAT) pipeline that applies clipping to suppress outliers and allows the network to adapt to quantization thresholds during training. Our comprehensive experiments demonstrate that Q-SAM2 allows for highly accurate inference while substantially improving efficiency. Both quantitative and visual results show that our Q-SAM2 surpasses existing state-of-the-art general quantization schemes, especially for ultra-low 2-bit quantization. While designed for quantization-aware training, our proposed calibration technique also proves effective in post-training quantization, achieving up to a 66% mIoU accuracy improvement over non-calibrated models.
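
The two ideas in the abstract, clipping outliers before quantizing and calibrating a linear layer against a Frobenius-norm objective on a small batch, can be illustrated with a minimal sketch. This is not the Q-SAM2 algorithm: the abstract does not give the exact objective or how weights are repositioned, so the sketch swaps in a simple grid search over clipping thresholds that minimizes the Frobenius norm of the layer's output error. All function names (`quantize`, `calibrate_clip`, and the helpers) and the toy data are hypothetical.

```python
import math

def quantize(w, clip, bits):
    """Symmetric uniform fake-quantization of one value (assumes bits >= 2).

    Values are rounded to a grid with step clip / qmax and saturated at
    [-clip, clip]. The grid is kept symmetric, so one integer code is unused.
    """
    qmax = 2 ** (bits - 1) - 1          # 1 for 2-bit, 7 for 4-bit
    scale = clip / qmax
    q = math.floor(w / scale + 0.5)     # round half up
    q = max(-qmax, min(qmax, q))        # clipping suppresses outliers
    return q * scale

def quantize_matrix(W, clip, bits):
    return [[quantize(w, clip, bits) for w in row] for row in W]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def frobenius_error(A, B):
    """Frobenius norm of A - B."""
    return math.sqrt(sum((a - b) ** 2
                         for ra, rb in zip(A, B)
                         for a, b in zip(ra, rb)))

def calibrate_clip(X, W, bits, num_candidates=20):
    """Pick the clipping threshold minimizing ||XW - X Q(W)||_F.

    X is a small calibration batch (rows are samples); W is the weight
    matrix of a linear layer with at least one nonzero entry. This grid
    search is an assumed stand-in for the paper's calibration step.
    """
    w_max = max(abs(w) for row in W for w in row)
    reference = matmul(X, W)
    best_clip, best_err = w_max, float("inf")
    for k in range(1, num_candidates + 1):
        clip = w_max * k / num_candidates
        err = frobenius_error(reference, matmul(X, quantize_matrix(W, clip, bits)))
        if err < best_err:
            best_clip, best_err = clip, err
    return best_clip, best_err

# Toy layer with one weight far larger than the rest.
X = [[1.0, 0.5], [0.2, -1.0], [0.7, 0.3]]
W = [[0.1, -0.2], [0.15, 4.0]]
w_max = max(abs(w) for row in W for w in row)
naive_err = frobenius_error(matmul(X, W), matmul(X, quantize_matrix(W, w_max, 2)))
clip, err = calibrate_clip(X, W, bits=2)
print("naive clip:", w_max, "error:", naive_err)
print("calibrated clip:", clip, "error:", err)
```

Because the search includes the unclipped threshold `w_max` as its last candidate, the calibrated error can never exceed the naive max-scaled error on the calibration batch. In a QAT setting the threshold would typically be a learnable parameter rather than a grid-searched constant, which this sketch does not attempt to show.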

