CV · Aug 16, 2019

3D Rigid Motion Segmentation with Mixed and Unknown Number of Models

arXiv:1908.06087v19.036 citations

Originality: Incremental advance

AI Analysis

This work addresses motion segmentation for computer vision applications, offering an incremental improvement by integrating multiple models and handling unknown model counts.

The paper tackles 3D rigid motion segmentation in video sequences. It proposes a multi-model spectral clustering framework that combines homography and fundamental matrix models, together with model selection criteria for an unknown number of moving objects. The authors report state-of-the-art performance on existing datasets and introduce a more challenging dataset adapted from KITTI.

Many real-world video sequences cannot be conveniently categorized as general or degenerate; in such cases, imposing a false dichotomy in using the fundamental matrix or homography model for motion segmentation on video sequences would lead to difficulty. Even when we are confronted with a general scene-motion, the fundamental matrix approach as a model for motion segmentation still suffers from several defects, which we discuss in this paper. The full potential of the fundamental matrix approach could only be realized if we judiciously harness information from the simpler homography model. From these considerations, we propose a multi-model spectral clustering framework that synergistically combines multiple models (homography and fundamental matrix) together. We show that the performance can be substantially improved in this way. For general motion segmentation tasks, the number of independently moving objects is often unknown a priori and needs to be estimated from the observations. This is referred to as model selection and it is essentially still an open research problem. In this work, we propose a set of model selection criteria balancing data fidelity and model complexity. We perform extensive testing on existing motion segmentation datasets with both segmentation and model selection tasks, achieving state-of-the-art performance on all of them; we also put forth a more realistic and challenging dataset adapted from the KITTI benchmark, containing real-world effects such as strong perspectives and strong forward translations not seen in the traditional datasets.
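To make the idea of combining per-model affinities concrete, here is a minimal sketch in Python. It is not the authors' method: the abstract does not specify how the homography and fundamental matrix affinities are fused or which model selection criteria are used. The sketch assumes two precomputed affinity matrices (one per model), fuses them by averaging their symmetrically normalized forms, and estimates the number of motions with the standard eigengap heuristic as a stand-in for the paper's criteria. All function names and the toy data are hypothetical.

```python
import numpy as np


def normalized_affinity(A):
    """Symmetric normalization D^{-1/2} A D^{-1/2} of an affinity matrix."""
    d = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    return A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def fuse_affinities(affinities):
    """Average the normalized affinities (an assumed fusion rule)."""
    return sum(normalized_affinity(A) for A in affinities) / len(affinities)


def estimate_num_clusters(eigvals_desc, max_k):
    """Eigengap heuristic: choose k at the largest drop in the spectrum."""
    gaps = eigvals_desc[:max_k] - eigvals_desc[1:max_k + 1]
    return int(np.argmax(gaps)) + 1


def kmeans(X, k, iters=50):
    """Small deterministic k-means with farthest-point initialization."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def multi_model_spectral_clustering(affinities, k=None, max_k=4):
    """Cluster points from several model-specific affinity matrices.

    If k is None, the number of clusters is estimated from the spectrum.
    """
    W = fuse_affinities(affinities)
    eigvals, eigvecs = np.linalg.eigh(W)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]         # re-sort to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    if k is None:
        k = estimate_num_clusters(eigvals, max_k)
    U = eigvecs[:, :k]
    U = U / np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)
    return kmeans(U, k), k


# Toy data: 6 feature trajectories, two rigid motions of 3 points each.
# In practice the affinities would come from how well pairs of trajectories
# agree under sampled homography / fundamental matrix hypotheses.
groups = np.array([0, 0, 0, 1, 1, 1])
same = (groups[:, None] == groups[None, :]).astype(float)
A_homography = same * 1.0 + (1 - same) * 0.01
A_fundamental = same * 0.8 + (1 - same) * 0.05

labels, k_est = multi_model_spectral_clustering([A_homography, A_fundamental])
print("estimated number of motions:", k_est)
print("labels:", labels)
```

Averaging normalized affinities keeps either model from dominating purely because of its scale, which is one simple way to let the homography affinity support the fundamental matrix affinity. The eigengap rule is only one of many ways to trade data fidelity against model complexity and should not be read as the criteria proposed in the paper.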
