CV | Apr 22, 2024

RESfM: Robust Deep Equivariant Structure from Motion

arXiv:2404.14280v2 | 1 citation | h-index: 52 | ICLR
Originality: Incremental advance
AI Analysis

This paper addresses a fundamental computer vision challenge, 3D reconstruction from images, and provides a robust solution for practical applications with noisy data.

The paper tackles the problem of robust multiview Structure from Motion in realistic settings with outlier-contaminated point tracks, achieving state-of-the-art accuracies superior to deep-based methods and competitive with classical approaches.

Multiview Structure from Motion is a fundamental and challenging computer vision problem. A recent deep-based approach utilized matrix equivariant architectures for simultaneous recovery of camera pose and 3D scene structure from large image collections. That work, however, made the unrealistic assumption that the point tracks given as input are almost clean of outliers. Here, we propose an architecture suited to dealing with outliers by adding a multiview inlier/outlier classification module that respects the model equivariance and by utilizing a robust bundle adjustment step. Experiments demonstrate that our method can be applied successfully in realistic settings that include large image collections and point tracks extracted with common heuristics that include many outliers, achieving state-of-the-art accuracies in almost all runs, superior to existing deep-based methods and on-par with leading classical (non-deep) sequential and global methods.
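As a rough illustration of what "equivariant" means here, the sketch below applies a generic layer to a cameras-by-points grid of track measurements and checks that reordering cameras or points reorders the output in the same way. It is not the paper's architecture: the dense grid, the single scalar channel, the function names `equivariant_layer` and `permute`, and all coefficient values are assumptions made for the example.

```python
import math


def equivariant_layer(x, a, b, c, d, bias):
    """Generic layer on an m-by-n grid (rows = cameras, columns = points).

    Each output entry mixes the entry itself with its row mean, its column
    mean, and the global mean, so permuting rows or columns of the input
    permutes the output identically. Assumes a dense grid with one scalar
    per entry (a simplification for illustration).
    """
    m, n = len(x), len(x[0])
    row_mean = [sum(row) / n for row in x]
    col_mean = [sum(x[i][j] for i in range(m)) / m for j in range(n)]
    all_mean = sum(row_mean) / m
    return [
        [a * x[i][j] + b * row_mean[i] + c * col_mean[j] + d * all_mean + bias
         for j in range(n)]
        for i in range(m)
    ]


def permute(grid, row_perm, col_perm):
    """Reorder rows (cameras) and columns (points) of a grid."""
    return [[grid[i][j] for j in col_perm] for i in row_perm]


# Hypothetical toy input: 2 cameras, 3 points, arbitrary values.
x = [[0.5, -1.0, 2.0], [1.5, 0.0, -0.5]]
params = dict(a=1.0, b=0.5, c=-0.25, d=2.0, bias=0.1)
row_perm, col_perm = [1, 0], [2, 0, 1]

y = equivariant_layer(x, **params)
y_from_permuted_input = equivariant_layer(permute(x, row_perm, col_perm), **params)
permuted_y = permute(y, row_perm, col_perm)

# Equivariance: layer(permute(x)) matches permute(layer(x)) up to rounding.
is_equivariant = all(
    math.isclose(p, q, abs_tol=1e-12)
    for row_p, row_q in zip(y_from_permuted_input, permuted_y)
    for p, q in zip(row_p, row_q)
)
```

An inlier/outlier classifier built from layers with this property would assign each track entry the same label regardless of the order in which images or points are listed, which is one plausible reading of a module that "respects the model equivariance".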

