CV · Nov 29, 2022

SparsePose: Sparse-View Camera Pose Regression and Refinement

Stanford
arXiv:2211.16991v1 · 59 citations · h-index: 45
Originality: Incremental advance
AI Analysis

The paper addresses the time-consuming or impractical dense image capture that 3D reconstruction usually requires by targeting sparse-view scenarios. The advance is incremental because it builds on existing learning-based methods.

The paper tackles camera pose estimation from sparse image sets (fewer than 10 images), proposing SparsePose to regress and refine poses, which significantly outperforms baselines in accuracy and enables high-fidelity 3D reconstruction with only 5-9 images.

Camera pose estimation is a key step in standard 3D reconstruction pipelines that operate on a dense set of images of a single object or scene. However, methods for pose estimation often fail when only a few images are available because they rely on the ability to robustly identify and match visual features between image pairs. While these methods can work robustly with dense camera views, capturing a large set of images can be time-consuming or impractical. We propose SparsePose for recovering accurate camera poses given a sparse set of wide-baseline images (fewer than 10). The method learns to regress initial camera poses and then iteratively refine them after training on a large-scale dataset of objects (Co3D: Common Objects in 3D). SparsePose significantly outperforms conventional and learning-based baselines in recovering accurate camera rotations and translations. We also demonstrate our pipeline for high-fidelity 3D reconstruction using only 5-9 images of an object.
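The abstract reports accuracy in terms of recovered camera rotations. As a rough illustration of how such accuracy is commonly measured, the sketch below computes the geodesic angle between a predicted and a reference rotation matrix. It assumes a standard angular-error metric and is not confirmed to be the paper's exact evaluation protocol; the helper names (`rotation_angle_deg`, `rot_z`) are hypothetical and chosen only for this example.

```python
import math

def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    """Transpose a 3x3 matrix given as nested lists."""
    return [[a[j][i] for j in range(3)] for i in range(3)]

def rotation_angle_deg(r_pred, r_true):
    """Geodesic angle in degrees between two rotation matrices.

    Assumed metric (illustrative): the angle of the residual rotation
    R_pred^T @ R_true, recovered from its trace.
    """
    r_delta = matmul(transpose(r_pred), r_true)
    trace = r_delta[0][0] + r_delta[1][1] + r_delta[2][2]
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))

def rot_z(deg):
    """Hypothetical helper: rotation about the z-axis by `deg` degrees."""
    t = math.radians(deg)
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

# A prediction that is off by 30 degrees about the z-axis.
error = rotation_angle_deg(rot_z(10.0), rot_z(40.0))
```

In a sparse-view setting the absolute world frame is arbitrary, so an error like this would typically be applied to relative rotations between image pairs rather than to absolute poses.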
