Conditional Single-view Shape Generation for Multi-view Stereo Reconstruction
This work addresses the ambiguity of reconstructing a 3D shape from a single view, a recurring problem in computer vision applications, and represents an incremental improvement over existing deterministic approaches.
The paper tackles 3D shape reconstruction from single images by modeling the uncertainty in occluded parts, then extends this to multi-view reconstruction by intersecting the shape spaces predicted from each view. It achieves lower 3D reconstruction test error than state-of-the-art methods and demonstrates generalization to real-world data.
In this paper, we present a new perspective on image-based shape generation. Most existing deep-learning-based shape reconstruction methods employ a single-view deterministic model, which is sometimes insufficient to determine a single ground-truth shape because the back of the object is occluded. In this work, we first introduce a conditional generative network to model the uncertainty in single-view reconstruction. We then formulate multi-view reconstruction as taking the intersection of the shape spaces predicted from each single image. We design new differentiable guidance, including the front constraint, the diversity constraint, and the consistency loss, to enable effective single-view conditional generation and multi-view synthesis. Experimental results and ablation studies show that our proposed approach outperforms state-of-the-art methods in 3D reconstruction test error and demonstrate its generalization ability on real-world data.
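To make the "intersection of predicted shape spaces" idea concrete, the toy sketch below treats each view's shape space as a small, finite list of candidate point clouds and picks one candidate per view so that the chosen shapes agree with each other. Everything here is an illustrative assumption rather than the method of the paper: the function names (`chamfer`, `consistency`, `intersect_shape_spaces`) are hypothetical, the symmetric Chamfer distance is swapped in as a stand-in for the consistency loss, and an exhaustive search replaces the differentiable optimization described above.

```python
from itertools import product


def chamfer(a, b):
    """Symmetric Chamfer distance (squared Euclidean) between two non-empty point lists."""
    def sq_dist(p, q):
        return sum((x - y) ** 2 for x, y in zip(p, q))

    def one_way(src, dst):
        # Average distance from each source point to its nearest destination point.
        return sum(min(sq_dist(p, q) for q in dst) for p in src) / len(src)

    return one_way(a, b) + one_way(b, a)


def consistency(shapes):
    """Sum of pairwise Chamfer distances; zero when all shapes coincide."""
    total = 0.0
    for i in range(len(shapes)):
        for j in range(i + 1, len(shapes)):
            total += chamfer(shapes[i], shapes[j])
    return total


def intersect_shape_spaces(candidates_per_view):
    """Choose one candidate per view so the chosen shapes agree best.

    Brute-force stand-in for optimizing a consistency loss (assumption).
    Returns (tuple of chosen candidate indices, consistency cost).
    """
    best = None
    index_ranges = (range(len(c)) for c in candidates_per_view)
    for choice in product(*index_ranges):
        shapes = [candidates_per_view[v][k] for v, k in enumerate(choice)]
        cost = consistency(shapes)
        if best is None or cost < best[1]:
            best = (choice, cost)
    return best


# Hypothetical data: both views agree on the visible part and each proposes
# two hypotheses for one occluded point.
visible = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
view_a = [visible + [(0.0, 0.0, 1.0)], visible + [(0.0, 0.0, 2.0)]]
view_b = [visible + [(0.0, 0.0, 2.0)], visible + [(0.0, 0.0, 3.0)]]

choice, cost = intersect_shape_spaces([view_a, view_b])
```

In this example the only hypothesis shared by both views is the one with the occluded point at height 2.0, so the search selects the second candidate of the first view and the first candidate of the second view. A learned model would instead sample candidates from a conditional generator and minimize the loss by gradient descent over latent codes, which this sketch does not attempt.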