CV · Jul 8, 2024

Submodular video object proposal selection for semantic object segmentation

arXiv:2407.05913v1 · 5 citations · h-index: 27
Originality Incremental advance
AI Analysis

This work addresses video object segmentation for computer vision applications, but it appears incremental as it builds on existing methods with a novel selection process.

The paper tackles semantic video object segmentation by learning a data-driven representation and pruning noisy detections, and reports superior performance over state-of-the-art methods on a challenging dataset.

Learning a data-driven spatio-temporal semantic representation of the objects is the key to coherent and consistent labelling in video. This paper proposes to achieve semantic video object segmentation by learning a data-driven representation which captures the synergy of multiple instances from continuous frames. To prune the noisy detections, we exploit the rich information among multiple instances and select the discriminative and representative subset. This selection process is formulated as a facility location problem solved by maximising a submodular function. Our method retrieves the longer-term contextual dependencies which underpin a robust semantic video object segmentation algorithm. We present extensive experiments on a challenging dataset that demonstrate the superior performance of our approach compared with the state-of-the-art methods.
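The abstract describes the selection step only at a high level, so the following is a minimal sketch of generic greedy facility-location maximisation, not the paper's actual objective or implementation. The function names, the Gaussian similarity, and the toy feature vectors are all assumptions made for illustration; the paper's proposal features and any extra terms in its objective are not given here.

```python
import numpy as np


def facility_location_value(similarity, selected):
    """F(S) = sum over items i of the best similarity between i and any member of S."""
    if len(selected) == 0:
        return 0.0
    return float(similarity[:, selected].max(axis=1).sum())


def greedy_facility_location(similarity, budget):
    """Greedily pick up to `budget` column indices that maximise F(S).

    `similarity[i, j]` must be non-negative; it scores how well candidate j
    represents item i.
    """
    n_items, n_candidates = similarity.shape
    selected = []
    chosen = np.zeros(n_candidates, dtype=bool)
    coverage = np.zeros(n_items)  # best similarity of each item to the current set
    for _ in range(min(budget, n_candidates)):
        # Marginal gain of adding each candidate j to the current set.
        gains = np.maximum(similarity, coverage[:, None]).sum(axis=0) - coverage.sum()
        gains[chosen] = -np.inf  # never re-pick a selected candidate
        best = int(np.argmax(gains))
        if gains[best] <= 0:
            break  # no remaining candidate improves the objective
        selected.append(best)
        chosen[best] = True
        coverage = np.maximum(coverage, similarity[:, best])
    return selected


# Toy example: six hypothetical proposal features forming two tight groups.
features = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                     [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
sq_dists = ((features[:, None, :] - features[None, :, :]) ** 2).sum(axis=-1)
similarity = np.exp(-sq_dists)  # Gaussian similarity, values in (0, 1]
picked = greedy_facility_location(similarity, budget=2)
print(picked)  # one index from each group
```

With non-negative similarities the facility location function is monotone and submodular, so this greedy rule is guaranteed to reach at least a (1 - 1/e) fraction of the optimal value under a cardinality budget. In the toy example the second pick comes from the other group because a near-duplicate of an already selected proposal adds almost no marginal gain, which is the sense in which the selected subset is representative.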

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
