CV · LG · May 6, 2024

PTQ4SAM: Post-Training Quantization for Segment Anything

arXiv:2405.03144v1 · 41 citations · Has Code · CVPR
Originality: Incremental advance
AI Analysis

This work addresses deployment challenges for large-scale vision models like SAM, making them more efficient for practical use, though it is incremental as it builds on existing quantization methods.

The authors tackle the high memory and computation costs of the Segment Anything Model (SAM) by proposing PTQ4SAM, a post-training quantization framework. When quantizing SAM-L to 6-bit, they report what they describe as lossless accuracy for instance segmentation (a drop of about 0.5%) with a theoretical 3.9× acceleration.

Segment Anything Model (SAM) has achieved impressive performance in many computer vision tasks. However, as a large-scale model, the immense memory and computation costs hinder its practical deployment. In this paper, we propose a post-training quantization (PTQ) framework for Segment Anything Model, namely PTQ4SAM. First, we investigate the inherent bottleneck of SAM quantization attributed to the bimodal distribution in post-Key-Linear activations. We analyze its characteristics from both per-tensor and per-channel perspectives, and propose a Bimodal Integration strategy, which utilizes a mathematically equivalent sign operation to transform the bimodal distribution into a relatively easy-quantized normal distribution offline. Second, SAM encompasses diverse attention mechanisms (i.e., self-attention and two-way cross-attention), resulting in substantial variations in the post-Softmax distributions. Therefore, we introduce an Adaptive Granularity Quantization for Softmax through searching the optimal power-of-two base, which is hardware-friendly. Extensive experimental results across various vision tasks (instance segmentation, semantic segmentation and object detection), datasets and model variants show the superiority of PTQ4SAM. For example, when quantizing SAM-L to 6-bit, we achieve lossless accuracy for instance segmentation, about 0.5% drop with theoretical 3.9× acceleration. The code is available at https://github.com/chengtao-lv/PTQ4SAM.
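
The "mathematically equivalent sign operation" in the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation: the sign-selection rule (flip a channel when its mean is negative), the helper names, and the synthetic data are all assumptions made for illustration. It only shows why flipping the same channels in both queries and keys leaves the attention logits Q·Kᵀ unchanged, while folding the two modes of the key activations into one.

```python
import random


def attention_logits(q, k):
    """Compute Q @ K^T for matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(q_row, k_row)) for k_row in k] for q_row in q]


def channel_signs(k):
    """Hypothetical rule (an assumption): flip channels whose mean is negative."""
    n_channels = len(k[0])
    means = [sum(row[c] for row in k) / len(k) for c in range(n_channels)]
    return [-1.0 if m < 0 else 1.0 for m in means]


def apply_signs(x, signs):
    """Multiply every channel (column) of x by its sign."""
    return [[v * s for v, s in zip(row, signs)] for row in x]


random.seed(0)
tokens = 8
# Synthetic keys: each channel clusters near +3 or -3, so the values pooled
# over all channels form two modes (a stand-in for the bimodal case).
centers = [3.0, -3.0, 3.0, -3.0]
k = [[random.gauss(c, 0.5) for c in centers] for _ in range(tokens)]
q = [[random.gauss(0.0, 1.0) for _ in centers] for _ in range(tokens)]

signs = channel_signs(k)
k_flipped = apply_signs(k, signs)  # all channels now cluster near +3
q_flipped = apply_signs(q, signs)  # same flip on queries cancels it out

before = attention_logits(q, k)
after = attention_logits(q_flipped, k_flipped)
max_diff = max(abs(a - b) for ra, rb in zip(before, after) for a, b in zip(ra, rb))
print("signs:", signs)
print("max |logit difference|:", max_diff)
```

Because each sign squares to 1, the per-channel products in Q·Kᵀ are unchanged. In a real model the same flip could be folded into the corresponding rows of the query and key linear weights and biases ahead of time, which is consistent with the abstract's statement that the transformation happens offline.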

Code Implementations: 1 repo