FvOR: Robust Joint Shape and Pose Optimization for Few-view Object Reconstruction
The work addresses a key challenge in computer vision, with applications such as robotics and AR/VR, though it appears incremental: it builds on existing methods and adds specific improvements.
The paper tackles the problem of reconstructing accurate 3D object models from a few images with noisy camera poses, achieving best-in-class results on ShapeNet and being two orders of magnitude faster than a recent optimization-based approach.
Reconstructing an accurate 3D object model from a few image observations remains a challenging problem in computer vision. State-of-the-art approaches typically assume accurate camera poses as input, which can be difficult to obtain in realistic settings. In this paper, we present FvOR, a learning-based object reconstruction method that predicts accurate 3D models given a few images with noisy input poses. The core of our approach is a fast and robust multi-view reconstruction algorithm that jointly refines 3D geometry and camera pose estimates using learnable neural network modules. We provide a thorough benchmark of state-of-the-art approaches for this problem on ShapeNet. Our approach achieves best-in-class results. It is also two orders of magnitude faster than the recent optimization-based approach IDR. Our code is released at \url{https://github.com/zhenpeiyang/FvOR/}.
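The general idea of jointly refining shape and pose can be illustrated with a deliberately simplified toy. The sketch below is not FvOR's algorithm, which the abstract describes as using learnable neural network modules. It is a plain alternating least-squares loop over 2D points, with each "camera pose" reduced to a translation. The function names (`refine`, `make_toy_problem`), the square test shape, and the choice to hold view 0 fixed to remove the global-shift ambiguity are all assumptions made for illustration.

```python
import random


def refine(observations, init_offsets, iterations=200):
    """Alternate between a shape update and a pose update.

    observations[v][p] is the 2D position of point p as seen in view v.
    init_offsets[v] is the (noisy) initial translation of view v.
    View 0 is held fixed so the solution cannot drift by a global shift.
    """
    num_views = len(observations)
    num_points = len(observations[0])
    offsets = [tuple(t) for t in init_offsets]
    shape = []
    for _ in range(iterations):
        # Shape step: with poses fixed, the least-squares estimate of each
        # point is the mean of its pose-corrected observations.
        shape = [
            tuple(
                sum(observations[v][p][k] - offsets[v][k] for v in range(num_views))
                / num_views
                for k in range(2)
            )
            for p in range(num_points)
        ]
        # Pose step: with the shape fixed, the least-squares estimate of each
        # translation is the mean residual over points (view 0 stays anchored).
        for v in range(1, num_views):
            offsets[v] = tuple(
                sum(observations[v][p][k] - shape[p][k] for p in range(num_points))
                / num_points
                for k in range(2)
            )
    return shape, offsets


def make_toy_problem(seed=0, num_views=4, noise=0.3):
    """Build a noiseless-observation toy problem with noisy initial poses."""
    rng = random.Random(seed)
    true_shape = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
    true_offsets = [(rng.uniform(-2, 2), rng.uniform(-2, 2)) for _ in range(num_views)]
    observations = [
        [(x + tx, y + ty) for (x, y) in true_shape] for (tx, ty) in true_offsets
    ]
    # View 0 keeps its true pose; the remaining poses are perturbed.
    noisy_offsets = [true_offsets[0]] + [
        (tx + rng.gauss(0, noise), ty + rng.gauss(0, noise))
        for (tx, ty) in true_offsets[1:]
    ]
    return true_shape, true_offsets, observations, noisy_offsets


if __name__ == "__main__":
    true_shape, true_offsets, observations, noisy_offsets = make_toy_problem()
    shape, offsets = refine(observations, noisy_offsets)
    worst = max(abs(a - b) for s, t in zip(shape, true_shape) for a, b in zip(s, t))
    print(f"largest shape error after refinement: {worst:.2e}")
```

In this toy, each round shrinks the average pose error by a factor of (V - 1) / V for V views, so both the shape and the translations converge to the ground truth. Real few-view reconstruction involves rotations, perspective projection, and image evidence rather than direct point observations, which is where learned modules such as those described in the abstract come in.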