CV, Sep 23, 2025

The 1st Solution for MOSEv2 Challenge 2025: Long-term and Concept-aware Video Segmentation via SeC

arXiv:2509.19183v1
h-index: 7
Originality: Synthesis-oriented
AI Analysis

This work addresses video object segmentation for computer vision researchers, but it is incremental as it adapts an existing framework to a specific challenge.

The paper tackles complex semi-supervised video object segmentation in the MOSEv2 challenge by adapting SeC, an enhanced SAM-2 framework, to leverage long-term memory for temporal continuity and concept-aware memory for semantic priors, achieving a first-place ranking with a JF score of 39.89% on the test set.

This technical report explores the MOSEv2 track of the LSVOS Challenge, which targets complex semi-supervised video object segmentation. By analysing and adapting SeC, an enhanced SAM-2 framework, we conduct a detailed study of its long-term memory and concept-aware memory. The study shows that long-term memory preserves temporal continuity under occlusion and reappearance, while concept-aware memory supplies semantic priors that suppress distractors; together, these traits directly address several of MOSEv2's core challenges. Our solution achieves a JF score of 39.89% on the test set, ranking 1st in the MOSEv2 track of the LSVOS Challenge.
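To make the reported score easier to interpret, here is a minimal sketch of how a score of this kind is commonly computed in video object segmentation. It assumes that "JF" follows the usual convention of averaging region similarity J (the Jaccard index between predicted and ground-truth masks) with a contour accuracy F; the exact metric variant used by the MOSEv2 evaluation server is not specified in this summary. The function names `region_similarity` and `mean_j_and_f` are hypothetical, and the contour score is taken as a given input rather than computed.

```python
def region_similarity(pred, gt):
    """Jaccard index (intersection over union) of two binary masks.

    Masks are nested lists of 0/1 values with identical shapes.
    """
    inter = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            if p and g:
                inter += 1
            if p or g:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else inter / union


def mean_j_and_f(j, f):
    """Average a region score J with a contour score F (assumed convention)."""
    return (j + f) / 2.0


if __name__ == "__main__":
    pred = [[1, 1], [0, 0]]
    gt = [[1, 0], [0, 0]]
    j = region_similarity(pred, gt)
    print(f"J = {j:.2f}")
```

In practice such per-frame scores are averaged over all annotated frames and objects, so a single headline number summarises performance across the whole test set.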

