CV · Aug 17, 2017

Learning a Multi-View Stereo Machine

Abhishek Kar, Christian Häne, Jitendra Malik

arXiv:1708.05375v1 37.6582 citations

Originality: Highly original

AI Analysis

This paper addresses efficient and robust 3D reconstruction for computer vision applications. It contributes a novel method for a known bottleneck rather than a paradigm shift.

The paper develops a differentiable multi-view stereo system for 3D reconstruction from images. It builds 3D geometry into the model through feature projection and unprojection, which enables reconstruction from fewer images (even a single one) and completion of unseen surfaces. Evaluation on ShapeNet shows benefits over both classical and learning-based methods.

We present a learnt system for multi-view stereopsis. In contrast to recent learning-based methods for 3D reconstruction, we leverage the underlying 3D geometry of the problem through feature projection and unprojection along viewing rays. By formulating these operations in a differentiable manner, we are able to learn the system end-to-end for the task of metric 3D reconstruction. End-to-end learning allows us to jointly reason about shape priors while conforming to geometric constraints, enabling reconstruction from far fewer images (even a single image) than required by classical approaches as well as completion of unseen surfaces. We thoroughly evaluate our approach on the ShapeNet dataset and demonstrate the benefits over classical approaches as well as recent learning-based methods.
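To make "unprojection along viewing rays" concrete, here is a minimal NumPy sketch of the general idea, not the authors' implementation. It assumes a pinhole camera with intrinsics `K` and a world-to-camera pose `(R, t)`, projects each voxel center into the image, and bilinearly samples the 2D feature map there. The function name `unproject_features`, its arguments, and the zero-fill rule for points outside the view are all illustrative assumptions. NumPy does not compute gradients, so training such a step end-to-end would need an autodiff framework; the sketch only shows that the sampling is built from smooth arithmetic on the features.

```python
import numpy as np

def unproject_features(feat, K, R, t, grid_points):
    """Lift 2D features onto 3D points by projecting and sampling (illustrative).

    feat:        (H, W, C) image feature map, with H >= 2 and W >= 2
    K:           (3, 3) pinhole intrinsics with last row [0, 0, 1] (assumed)
    R, t:        (3, 3) rotation and (3,) translation, world -> camera (assumed)
    grid_points: (N, 3) voxel centers in world coordinates
    Returns an (N, C) array; points behind the camera or outside the image get zeros.
    """
    H, W, _ = feat.shape
    cam = grid_points @ R.T + t              # world -> camera coordinates
    z = cam[:, 2]
    safe_z = np.where(z > 1e-8, z, 1.0)      # avoid dividing by zero or negative depth
    pix = cam @ K.T                          # homogeneous pixel coordinates
    u = pix[:, 0] / safe_z                   # column (x) coordinate in pixels
    v = pix[:, 1] / safe_z                   # row (y) coordinate in pixels

    # A point is valid if it lies in front of the camera and inside the image.
    valid = (z > 1e-8) & (u >= 0) & (u <= W - 1) & (v >= 0) & (v <= H - 1)

    # Clamp so that indexing is always safe; invalid points are zeroed at the end.
    uc = np.clip(u, 0, W - 1)
    vc = np.clip(v, 0, H - 1)
    u0 = np.minimum(np.floor(uc).astype(int), W - 2)
    v0 = np.minimum(np.floor(vc).astype(int), H - 2)
    du = (uc - u0)[:, None]
    dv = (vc - v0)[:, None]

    # Bilinear interpolation over the four neighbouring pixels.
    out = ((1 - du) * (1 - dv) * feat[v0, u0]
           + du * (1 - dv) * feat[v0, u0 + 1]
           + (1 - du) * dv * feat[v0 + 1, u0]
           + du * dv * feat[v0 + 1, u0 + 1])
    return out * valid[:, None]
```

In this sketch, every 3D point on the same viewing ray projects to the same pixel and therefore receives the same feature. Disambiguating depth along the ray would then rely on combining the lifted features from several views.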

