CV · Jul 3, 2024

CAVIS: Context-Aware Video Instance Segmentation

arXiv:2407.03010v2 · 7 citations · h-index: 4
AI Analysis

This paper addresses accurate object tracking in videos, a core computer vision task, and represents a strong incremental improvement over prior work.

The paper improves instance association in video instance segmentation by integrating contextual information around objects. It outperforms state-of-the-art methods on benchmark datasets, with notable gains on the challenging OVIS dataset.

In this paper, we introduce the Context-Aware Video Instance Segmentation (CAVIS), a novel framework designed to enhance instance association by integrating contextual information adjacent to each object. To efficiently extract and leverage this information, we propose the Context-Aware Instance Tracker (CAIT), which merges contextual data surrounding the instances with the core instance features to improve tracking accuracy. Additionally, we design the Prototypical Cross-frame Contrastive (PCC) loss, which ensures consistency in object-level features across frames, thereby significantly enhancing matching accuracy. CAVIS demonstrates superior performance over state-of-the-art methods on all benchmark datasets in video instance segmentation (VIS) and video panoptic segmentation (VPS). Notably, our method excels on the OVIS dataset, known for its particularly challenging videos. Project page: https://seung-hun-lee.github.io/projects/CAVIS/
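The abstract describes the PCC loss only at a high level: object-level features should stay consistent across frames so that matching becomes easier. As a rough illustration of that idea, the sketch below averages pixel features under a mask to form a per-instance prototype, then applies a generic InfoNCE-style contrastive loss between prototypes of two frames. This is not the authors' formulation. The function names (`masked_mean`, `cross_frame_contrastive_loss`), the `temperature` parameter, and the assumption that prototype `i` in both frames refers to the same object are all hypothetical choices made for the example.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def masked_mean(features, mask):
    """Average the feature vectors of pixels where mask is 1.

    features: list of per-pixel feature vectors.
    mask: list of 0/1 flags, one per pixel.
    The result serves as a simple per-instance "prototype" (an assumption).
    """
    selected = [f for f, m in zip(features, mask) if m]
    if not selected:
        raise ValueError("mask selects no pixels")
    dim = len(selected[0])
    return [sum(f[d] for f in selected) / len(selected) for d in range(dim)]


def cross_frame_contrastive_loss(protos_t, protos_t1, temperature=0.1):
    """Generic InfoNCE-style loss between prototypes of two frames.

    Assumes protos_t[i] and protos_t1[i] belong to the same object, so
    index i is the positive pair and all other indices are negatives.
    """
    anchors = [l2_normalize(p) for p in protos_t]
    candidates = [l2_normalize(p) for p in protos_t1]
    total = 0.0
    for i, anchor in enumerate(anchors):
        # Cosine similarity to every candidate, scaled by the temperature.
        logits = [
            sum(a * c for a, c in zip(anchor, cand)) / temperature
            for cand in candidates
        ]
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denom = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denom - logits[i]
    return total / len(anchors)


# Two objects whose features stay consistent across frames...
consistent = cross_frame_contrastive_loss(
    [[1.0, 0.0], [0.0, 1.0]], [[0.9, 0.1], [0.1, 0.9]]
)
# ...versus the same objects with their features swapped in the next frame.
swapped = cross_frame_contrastive_loss(
    [[1.0, 0.0], [0.0, 1.0]], [[0.1, 0.9], [0.9, 0.1]]
)
```

In this toy setup the loss is small when each object's prototype stays close to itself across frames and large when the prototypes are swapped, which is the behaviour a cross-frame consistency objective is meant to encourage.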

Code Implementations: 1 repo
