CV · Aug 12, 2019

Point-Based Multi-View Stereo Network

arXiv:1908.04422v130.8407 citations · Has Code

Originality: Highly original

AI Analysis

This work addresses 3D reconstruction for computer vision applications, offering higher accuracy and efficiency than existing cost-volume methods, though it is incremental in that it builds on point cloud processing.

The paper tackles multi-view stereo reconstruction by introducing a point-based deep framework that processes scenes as point clouds, achieving a significant improvement in reconstruction quality on the DTU and Tanks and Temples datasets compared with state-of-the-art methods.

We introduce Point-MVSNet, a novel point-based deep framework for multi-view stereo (MVS). Distinct from existing cost volume approaches, our method directly processes the target scene as point clouds. More specifically, our method predicts the depth in a coarse-to-fine manner. We first generate a coarse depth map, convert it into a point cloud and refine the point cloud iteratively by estimating the residual between the depth of the current iteration and that of the ground truth. Our network leverages 3D geometry priors and 2D texture information jointly and effectively by fusing them into a feature-augmented point cloud, and processes the point cloud to estimate the 3D flow for each point. This point-based architecture allows higher accuracy, more computational efficiency and more flexibility than cost-volume-based counterparts. Experimental results show that our approach achieves a significant improvement in reconstruction quality compared with state-of-the-art methods on the DTU and the Tanks and Temples dataset. Our source code and trained models are available at https://github.com/callmeray/PointMVSNet .
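The coarse-to-fine idea in the abstract (depth map, then point cloud, then iterative residual refinement) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes a standard pinhole camera with intrinsics matrix K, and the names unproject_depth, refine_depth, and predict_residual are hypothetical. In particular, predict_residual is a placeholder for whatever network estimates the per-point depth residual.

```python
import numpy as np

def unproject_depth(depth, K):
    """Lift an (H, W) depth map to an (H*W, 3) point cloud in camera coordinates.

    Assumes a pinhole camera with 3x3 intrinsics K (an assumption of this sketch).
    """
    h, w = depth.shape
    # Pixel grid: u runs along columns (x), v runs along rows (y).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pixels = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).astype(np.float64)
    # Back-project each pixel to a ray with unit z, then scale the ray by its depth.
    rays = pixels @ np.linalg.inv(K).T
    return rays * depth.reshape(-1, 1)

def refine_depth(depth, predict_residual, num_iters=2):
    """Iteratively add a predicted depth residual to the current estimate.

    predict_residual is a hypothetical callable standing in for a learned model.
    """
    for _ in range(num_iters):
        depth = depth + predict_residual(depth)
    return depth
```

With a toy residual predictor that closes half of the remaining gap to a known target at each step, refine_depth moves the estimate toward the target over successive iterations; a learned predictor would use image features and geometry instead of the target itself.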

