CV, AI | Feb 26

AMLRIS: Alignment-aware Masked Learning for Referring Image Segmentation

Tongfei Chen, Shuo Yang, Yuguang Yang, Linlin Yang, Runtang Guo, Changbai Li, He Long, Chunyu Xie, Dawei Leng, Baochang Zhang

arXiv:2602.22740v11.5h-index: 5

Originality: Incremental advance

AI Analysis

This paper offers an incremental advance for researchers working on referring image segmentation.

This paper addresses Referring Image Segmentation (RIS) by introducing Alignment-Aware Masked Learning (AML), a training strategy that estimates pixel-level vision-language alignment and filters out poorly aligned regions during optimization. The method achieves state-of-the-art performance on the RefCOCO datasets and improves robustness to diverse descriptions and scenarios.

Referring Image Segmentation (RIS) aims to segment the object in an image that a natural language expression refers to. The paper introduces Alignment-Aware Masked Learning (AML), a training strategy that explicitly estimates pixel-level vision-language alignment, filters out poorly aligned regions during optimization, and focuses training on trustworthy cues. The approach achieves state-of-the-art performance on the RefCOCO datasets and is more robust to diverse descriptions and scenarios.
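To make the idea of filtering poorly aligned regions during optimization concrete, here is a minimal sketch of one way an alignment-masked loss could look. It is not the paper's implementation: the cosine-similarity alignment score, the fixed `threshold`, the binary cross-entropy loss, and the helper names `cosine`, `alignment_mask`, and `masked_bce` are all assumptions chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u > 0 and norm_v > 0 else 0.0


def alignment_mask(pixel_feats, text_feat, threshold=0.0):
    """Hypothetical alignment estimate: keep pixels whose visual feature has
    cosine similarity >= threshold with the text embedding (assumed scoring rule)."""
    return [1 if cosine(f, text_feat) >= threshold else 0 for f in pixel_feats]


def masked_bce(probs, targets, mask, eps=1e-7):
    """Binary cross-entropy averaged over kept pixels only.
    Pixels with mask == 0 contribute nothing to the loss."""
    total, kept = 0.0, 0
    for p, t, m in zip(probs, targets, mask):
        if not m:
            continue
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
        kept += 1
    return total / kept if kept else 0.0


# Toy example: three "pixels" with 2-D features and one text embedding.
pixel_feats = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
text_feat = [1.0, 0.0]
mask = alignment_mask(pixel_feats, text_feat, threshold=0.5)
loss = masked_bce([0.9, 0.1, 0.2], [1, 1, 0], mask)
```

In this toy case only the first pixel passes the threshold, so the loss is computed from that pixel alone. A real training setup would operate on feature maps in a deep learning framework and would likely estimate alignment differently; the sketch only shows the general shape of excluding low-alignment pixels from the objective.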
