MeshMVS: Multi-View Stereo Guided Mesh Reconstruction
This work aims to improve the accuracy of 3D shape reconstruction from multi-view images, an incremental advance over existing multi-view shape generation techniques.
The paper tackles the problem of generating accurate 3D meshes from multi-view images by explicitly incorporating geometry information from multi-view stereo depth features, resulting in a 34% decrease in Chamfer distance and 14% increase in F1-score on ShapeNet compared to state-of-the-art methods.
Deep learning based 3D shape generation methods generally utilize latent features extracted from color images to encode the semantics of objects and guide the shape generation process. These color image semantics only implicitly encode 3D information, potentially limiting the accuracy of the generated shapes. In this paper we propose a multi-view mesh generation method which incorporates geometry information explicitly by using the features from intermediate depth representations of multi-view stereo and regularizing the 3D shapes against these depth images. First, our system predicts a coarse 3D volume from the color images by probabilistically merging the voxel occupancy grids predicted from individual views. Then, the depth images from multi-view stereo, along with the rendered depth images of the coarse shape, are used as a contrastive input whose features guide the refinement of the coarse shape through a series of graph convolution networks. Notably, we achieve better results than state-of-the-art multi-view shape generation methods, with a 34% decrease in Chamfer distance to ground truth and a 14% increase in F1-score on the ShapeNet dataset. Our source code is available at https://git.io/Jmalg
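To make the first stage concrete, the following is a minimal sketch of one way per-view voxel occupancy grids could be merged probabilistically. The abstract does not state the fusion rule, so the rule used here (treating views as independent and summing their log-odds) is an assumption, as are the function name `merge_occupancy`, the `eps` parameter, and the premise that all grids have already been resampled into a common reference frame.

```python
import numpy as np


def merge_occupancy(prob_grids, eps=1e-6):
    """Fuse per-view occupancy probabilities into a single grid.

    Hypothetical helper, not the authors' implementation.
    prob_grids: array-like of shape (V, D, H, W) with values in [0, 1],
    assumed to be aligned to a common reference frame already.
    """
    # Clip away exact 0 and 1 so the logarithms below stay finite.
    p = np.clip(np.asarray(prob_grids, dtype=np.float64), eps, 1.0 - eps)
    # Summing log-odds over views is equivalent to
    # prod(p) / (prod(p) + prod(1 - p)) under an independence assumption.
    log_odds = np.sum(np.log(p) - np.log1p(-p), axis=0)
    return 1.0 / (1.0 + np.exp(-log_odds))


# Three toy views of a 2x2x2 volume: two confident views, one uninformative.
views = np.stack([
    np.full((2, 2, 2), 0.8),
    np.full((2, 2, 2), 0.8),
    np.full((2, 2, 2), 0.5),
])
merged = merge_occupancy(views)  # every voxel becomes 16/17, about 0.941
```

Under this rule, agreeing views reinforce each other, a view at 0.5 leaves the result unchanged, and two views that disagree symmetrically (for example 0.8 and 0.2) cancel out to 0.5. A simple average would instead never exceed the most confident single view.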