CV · Sep 25, 2024

PTQ4RIS: Post-Training Quantization for Referring Image Segmentation

Xiaoyan Jiang, Hang Yang, Kaiying Zhu, Xihe Qiu, Shibo Zhao, Sifan Zhou

arXiv:2409.17020v2 · 7.63 citations · h-index: 10 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses the practical problem of enabling efficient RIS inference on edge devices, representing an incremental improvement by adapting quantization techniques specifically for RIS.

The authors tackled the challenge of deploying Referring Image Segmentation (RIS) models on resource-limited edge devices by proposing PTQ4RIS, a post-training quantization framework that achieved superior performance across three benchmarks with bit settings from 8 to 4 bits.

Referring Image Segmentation (RIS) aims to segment the object referred to by a given sentence in an image by understanding both visual and linguistic information. However, existing RIS methods tend to pursue top-performing models, disregarding practical deployment on resource-limited edge devices. This oversight poses a significant challenge for on-device RIS inference. To this end, we propose an effective and efficient post-training quantization framework termed PTQ4RIS. Specifically, we first conduct an in-depth analysis of the root causes of performance degradation in RIS model quantization and propose dual-region quantization (DRQ) and reorder-based outlier-retained quantization (RORQ) to address the quantization difficulties in visual and text encoders. Extensive experiments on three benchmarks with different bit settings (from 8 to 4 bits) demonstrate its superior performance. Importantly, PTQ4RIS is the first PTQ method specifically designed for the RIS task, highlighting the feasibility of PTQ in RIS applications. Code and video are available at https://github.com/gugu511yy/PTQ4RIS.
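
For readers unfamiliar with the bit settings mentioned in the abstract, the following is a minimal sketch of generic uniform post-training quantization with min/max calibration. It is not the authors' DRQ or RORQ; the function names and the toy activation list are assumptions made purely for illustration. It shows why a single outlier makes low-bit settings harder: the outlier stretches the calibrated range, so the reconstruction error grows as the bit width shrinks.

```python
def quantize_dequantize(values, num_bits):
    """Uniform asymmetric 'fake' quantization calibrated on the min/max of values.

    Hypothetical helper for illustration only; not taken from PTQ4RIS.
    """
    qmax = 2 ** num_bits - 1                 # largest integer code
    lo, hi = min(values), max(values)        # calibration range
    scale = (hi - lo) / qmax
    if scale == 0:                           # constant input: avoid dividing by zero
        scale = 1.0
    out = []
    for v in values:
        q = round((v - lo) / scale)          # map to an integer code
        q = max(0, min(qmax, q))             # clamp to the representable range
        out.append(q * scale + lo)           # map back to a real value
    return out


def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Toy "activations": values in [-1, 1] plus one large outlier (an assumption).
activations = [i / 100.0 for i in range(-100, 101)] + [8.0]

errors = {
    bits: mean_squared_error(activations, quantize_dequantize(activations, bits))
    for bits in (8, 6, 4)
}

for bits, err in errors.items():
    print(f"{bits}-bit reconstruction MSE: {err:.6f}")
```

In this toy setting the error increases monotonically from 8 to 4 bits, which is the general difficulty that outlier-aware schemes are meant to reduce.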

