CV | Nov 27, 2024

RoMo: Robust Motion Segmentation Improves Structure from Motion

arXiv:2411.18650v1 | 16 citations | h-index: 15
Originality: Incremental advance
AI Analysis

The paper addresses a limitation of SfM pipelines on scenes with dynamic content; the analysis rates it as an incremental improvement over existing methods.

The paper tackles robust motion segmentation as a way to improve structure-from-motion (SfM) camera calibration in dynamic scenes. Combined with an off-the-shelf SfM pipeline, the segmentation masks from its method, RoMo, establish a new state of the art, outperforming existing methods by a substantial margin.

There has been extensive progress in the reconstruction and generation of 4D scenes from monocular casually-captured video. While these tasks rely heavily on known camera poses, the problem of finding such poses using structure-from-motion (SfM) often depends on robustly separating static from dynamic parts of a video. The lack of a robust solution to this problem limits the performance of SfM camera-calibration pipelines. We propose a novel approach to video-based motion segmentation to identify the components of a scene that are moving w.r.t. a fixed world frame. Our simple but effective iterative method, RoMo, combines optical flow and epipolar cues with a pre-trained video segmentation model. It outperforms unsupervised baselines for motion segmentation as well as supervised baselines trained from synthetic data. More importantly, the combination of an off-the-shelf SfM pipeline with our segmentation masks establishes a new state-of-the-art on camera calibration for scenes with dynamic content, outperforming existing methods by a substantial margin.
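To make the epipolar cue mentioned in the abstract concrete, here is a minimal sketch of how optical-flow correspondences can be tested against epipolar geometry. It is not the authors' implementation. It assumes the fundamental matrix F between two frames is already known (the abstract describes the method as iterative, so the real system presumably re-estimates it), it omits the pre-trained video segmentation model entirely, and the function names, the synthetic scene, and the threshold of 1.0 squared pixels are all hypothetical choices made for illustration.

```python
import numpy as np


def sampson_distance(F, pts1, pts2):
    """Sampson (first-order geometric) epipolar error per correspondence.

    F    : (3, 3) fundamental matrix satisfying x2^T F x1 = 0 for static points.
    pts1 : (N, 2) pixel coordinates in the first frame.
    pts2 : (N, 2) matching pixel coordinates in the second frame.
    """
    ones = np.ones((pts1.shape[0], 1))
    x1 = np.hstack([pts1, ones])
    x2 = np.hstack([pts2, ones])
    Fx1 = x1 @ F.T    # row i is F @ x1_i (epipolar line in frame 2)
    Ftx2 = x2 @ F     # row i is F.T @ x2_i (epipolar line in frame 1)
    num = np.sum(x2 * Fx1, axis=1) ** 2
    den = Fx1[:, 0] ** 2 + Fx1[:, 1] ** 2 + Ftx2[:, 0] ** 2 + Ftx2[:, 1] ** 2
    return num / np.maximum(den, 1e-12)


def epipolar_motion_mask(F, pts, flow, threshold=1.0):
    """Flag points whose flow violates the epipolar constraint.

    The threshold (in squared pixels) is an assumed value, not one from the paper.
    """
    return sampson_distance(F, pts, pts + flow) > threshold


def synthetic_example(num_static=20, seed=0):
    """Hypothetical test scene: a sideways-translating camera, static points,
    and one final point that also moves vertically in the world."""
    rng = np.random.default_rng(seed)
    K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
    t = np.array([-0.5, 0.0, 0.0])  # second camera: X_cam2 = X + t (no rotation)
    t_cross = np.array([[0.0, -t[2], t[1]],
                        [t[2], 0.0, -t[0]],
                        [-t[1], t[0], 0.0]])
    K_inv = np.linalg.inv(K)
    F = K_inv.T @ t_cross @ K_inv

    n = num_static + 1
    X = np.column_stack([rng.uniform(-2.0, 2.0, n),
                         rng.uniform(-1.5, 1.5, n),
                         rng.uniform(4.0, 8.0, n)])
    X_next = X.copy()
    X_next[-1, 1] += 0.5  # the last point is dynamic

    def project(points):
        h = points @ K.T
        return h[:, :2] / h[:, 2:3]

    pts1 = project(X)
    pts2 = project(X_next + t)
    return F, pts1, pts2 - pts1  # flow is the per-point displacement


if __name__ == "__main__":
    F, pts, flow = synthetic_example()
    print(epipolar_motion_mask(F, pts, flow))
```

In this toy scene the static points satisfy the epipolar constraint up to floating-point error, while the last point moves off its epipolar line and is flagged. A per-point test like this cannot detect objects that happen to move along their epipolar lines, which is one reason an approach might combine it with other cues such as a segmentation model.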

