CV | Mar 29, 2024

DVIS-DAQ: Improving Video Segmentation via Dynamic Anchor Queries

arXiv:2404.00086v5 | 27 citations | h-index: 11 | Has Code | ECCV
Originality: Incremental advance
AI Analysis

The paper addresses a common real-world issue in video segmentation, relevant to applications such as surveillance and autonomous driving, but the advance is incremental because it builds on existing query-based methods.

The paper tackles the problem of video segmentation underperforming on newly emerging and disappearing objects by introducing Dynamic Anchor Queries (DAQ) and a query-level simulation strategy, achieving state-of-the-art performance on five benchmarks.

Modern video segmentation methods adopt object queries to perform inter-frame association and demonstrate satisfactory performance in tracking continuously appearing objects despite large-scale motion and transient occlusion. However, they all underperform on newly emerging and disappearing objects that are common in the real world because they attempt to model object emergence and disappearance through feature transitions between background and foreground queries that have significant feature gaps. We introduce Dynamic Anchor Queries (DAQ) to shorten the transition gap between the anchor and target queries by dynamically generating anchor queries based on the features of potential candidates. Furthermore, we introduce a query-level object Emergence and Disappearance Simulation (EDS) strategy, which unleashes DAQ's potential without any additional cost. Finally, we combine our proposed DAQ and EDS with DVIS to obtain DVIS-DAQ. Extensive experiments demonstrate that DVIS-DAQ achieves a new state-of-the-art (SOTA) performance on five mainstream video segmentation benchmarks. Code and models are available at https://github.com/SkyworkAI/DAQ-VS.

Code Implementations: 3 repos
