3D Shape Reconstruction from a Single 2D Image via 2D-3D Self-Consistency
This work addresses the scarcity of 3D ground truth data in computer vision by offering a semi-supervised solution that incrementally improves on existing reconstruction methods.
The paper tackles 3D shape reconstruction from single 2D images by proposing a semi-supervised framework using 2D-3D self-consistency, which aligns predicted 3D models with projected 2D masks and jointly predicts camera pose without supervision, achieving favorable performance against state-of-the-art methods in both supervised and semi-supervised settings.
3D shape reconstruction, which aims to infer 3D shapes from 2D images, has drawn considerable attention from the computer vision and deep learning communities. However, it is not practical to assume that 2D input images and their associated ground truth 3D shapes are always available during training. In this paper, we propose a framework for semi-supervised 3D reconstruction. This is realized through a 2D-3D self-consistency that we introduce, which aligns the predicted 3D models with the projected 2D foreground segmentation masks. Moreover, our model not only recovers 3D shapes together with the corresponding 2D masks but also jointly disentangles and predicts camera pose information, even though such supervision is never available during training. In the experiments, we qualitatively and quantitatively demonstrate the effectiveness of our model, which performs favorably against state-of-the-art approaches in both supervised and semi-supervised settings.
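To make the idea of 2D-3D self-consistency concrete, the following is a minimal sketch and not the implementation used in the paper. It assumes a voxel occupancy grid as the 3D representation, a fixed orthographic projection along one axis in place of the predicted camera pose, and a binary cross-entropy penalty between the projected silhouette and the foreground mask. The function names `project_silhouette` and `mask_consistency_loss` are hypothetical.

```python
import numpy as np


def project_silhouette(voxels: np.ndarray, axis: int = 2) -> np.ndarray:
    """Orthographic projection (an assumption): a pixel is foreground
    if any voxel along the viewing ray is occupied."""
    return voxels.max(axis=axis)


def mask_consistency_loss(voxels: np.ndarray, mask: np.ndarray,
                          axis: int = 2, eps: float = 1e-7) -> float:
    """Binary cross-entropy between the projected silhouette and a 2D mask.
    The choice of loss is illustrative, not taken from the paper."""
    silhouette = np.clip(project_silhouette(voxels, axis), eps, 1.0 - eps)
    bce = -(mask * np.log(silhouette) + (1.0 - mask) * np.log(1.0 - silhouette))
    return float(bce.mean())


# Toy example: a 2x2x2 cube inside a 4x4x4 grid and its matching mask.
voxels = np.zeros((4, 4, 4))
voxels[1:3, 1:3, 1:3] = 1.0
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0

loss_match = mask_consistency_loss(voxels, mask)
# Shifting the mask by one pixel breaks the alignment and raises the loss.
loss_mismatch = mask_consistency_loss(voxels, np.roll(mask, 1, axis=0))
```

In this toy setting the loss is near zero when the projected shape agrees with the mask and grows when they disagree, which is the signal that lets images without 3D ground truth contribute to training. A learned model would need a differentiable, pose-dependent projection in place of the fixed `max` used here.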