CV · Apr 18, 2024

Moving Object Segmentation: All You Need Is SAM (and Flow)

arXiv:2404.12389v2 · 31 citations · h-index: 49 · ACCV
Originality: Incremental advance
AI Analysis

The method offers a simple, effective approach to video analysis tasks such as autonomous driving and surveillance, though the advance is incremental because it builds on existing models (SAM and optical flow estimators).

The paper tackles motion segmentation by combining the Segment Anything Model (SAM) with optical flow, outperforming previous approaches by a considerable margin on both single- and multi-object benchmarks.

The objective of this paper is motion segmentation -- discovering and segmenting the moving objects in a video. This is a much studied area with numerous careful, and sometimes complex, approaches and training schemes including: self-supervised learning, learning from synthetic datasets, object-centric representations, amodal representations, and many more. Our interest in this paper is to determine if the Segment Anything model (SAM) can contribute to this task. We investigate two models for combining SAM with optical flow that harness the segmentation power of SAM with the ability of flow to discover and group moving objects. In the first model, we adapt SAM to take optical flow, rather than RGB, as an input. In the second, SAM takes RGB as an input, and flow is used as a segmentation prompt. These surprisingly simple methods, without any further modifications, outperform all previous approaches by a considerable margin in both single and multi-object benchmarks. We also extend these frame-level segmentations to sequence-level segmentations that maintain object identity. Again, this simple model achieves outstanding performance across multiple moving object segmentation benchmarks.
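To make the first model more concrete: an image segmenter expects a three-channel image, so a two-channel flow field has to be rendered as one before it can replace RGB. The sketch below shows one common rendering, the HSV colour wheel (hue for direction, saturation for speed). It is an illustration only: the abstract does not say how the flow is encoded, and the function name `flow_to_rgb` and its `max_magnitude` parameter are hypothetical.

```python
import colorsys
import math


def flow_to_rgb(flow, max_magnitude=None):
    """Render a dense flow field as a three-channel image (illustrative sketch).

    flow: list of rows, each row a list of (dx, dy) displacement tuples.
    Returns rows of (r, g, b) tuples with integer values in 0..255.
    Hue encodes motion direction; saturation encodes speed, so static
    pixels come out white. This encoding is an assumption, not the
    paper's documented preprocessing.
    """
    magnitudes = [[math.hypot(dx, dy) for dx, dy in row] for row in flow]
    if max_magnitude is None:
        max_magnitude = max((m for row in magnitudes for m in row), default=0.0)
    # Avoid division by zero when nothing moves.
    scale = max_magnitude if max_magnitude > 0 else 1.0

    image = []
    for row, mag_row in zip(flow, magnitudes):
        out_row = []
        for (dx, dy), mag in zip(row, mag_row):
            # Map the angle from [-pi, pi] to a hue in [0, 1).
            hue = ((math.atan2(dy, dx) + math.pi) / (2 * math.pi)) % 1.0
            sat = min(mag / scale, 1.0)
            r, g, b = colorsys.hsv_to_rgb(hue, sat, 1.0)
            out_row.append((round(r * 255), round(g * 255), round(b * 255)))
        image.append(out_row)
    return image


if __name__ == "__main__":
    # Hypothetical 2x2 flow field: two static pixels, two moving ones.
    demo = [[(0.0, 0.0), (1.0, 0.0)],
            [(0.0, -2.0), (0.0, 0.0)]]
    for line in flow_to_rgb(demo):
        print(line)
```

In this rendering the moving regions stand out as saturated colour against a white background, which suggests why a segmenter fed flow alone can isolate moving objects without relying on appearance.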

Code Implementations: 1 repo
