CV · Apr 30, 2020

M^3VSNet: Unsupervised Multi-metric Multi-view Stereo Network

arXiv:2005.00363v2 · 84 citations · Has Code
AI Analysis

This paper addresses the challenge of limited labeled data in 3D reconstruction for computer vision applications, offering an incremental improvement over existing unsupervised methods.

The paper tackles the problem of dense point cloud reconstruction in multi-view stereo without requiring ground-truth depth maps for training, achieving state-of-the-art unsupervised performance comparable to supervised methods on the DTU dataset and showing improved generalization on the Tanks and Temples benchmark.

Current multi-view stereo (MVS) methods based on supervised learning perform impressively compared with traditional MVS methods. However, the ground-truth depth maps needed for training are hard to obtain and cover only a limited range of scenarios. In this paper, we propose a novel unsupervised multi-metric MVS network, named M^3VSNet, for dense point cloud reconstruction without any supervision. To improve the robustness and completeness of point cloud reconstruction, we propose a novel multi-metric loss function that combines pixel-wise and feature-wise loss functions to learn the inherent constraints from different perspectives of matching correspondences. We also incorporate normal-depth consistency in the 3D point cloud format to improve the accuracy and continuity of the estimated depth maps. Experimental results show that M^3VSNet establishes the state of the art among unsupervised methods, achieves performance comparable to the previous supervised MVSNet on the DTU dataset, and demonstrates strong generalization on the Tanks and Temples benchmark with effective improvement. Our code is available at https://github.com/whubaichuan/M3VSNet
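The idea of combining a pixel-wise and a feature-wise term can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the choice of a masked L1 distance, the validity mask, and the weights `w_pixel` and `w_feature` are all assumptions made for illustration. It also assumes the source view has already been warped into the reference view using the estimated depth, and that image and feature maps share one resolution.

```python
from typing import Sequence

# An H x W map of scalar values (one image channel or one feature channel).
Grid = Sequence[Sequence[float]]


def masked_l1(ref: Grid, warped: Grid, mask: Grid) -> float:
    """Mean absolute difference over positions where the mask is non-zero."""
    total = 0.0
    count = 0
    for ref_row, warped_row, mask_row in zip(ref, warped, mask):
        for r, w, m in zip(ref_row, warped_row, mask_row):
            if m:
                total += abs(r - w)
                count += 1
    # With no valid positions there is nothing to penalize.
    return total / count if count else 0.0


def multi_metric_loss(
    ref_img: Grid,
    warped_img: Grid,
    ref_feat: Grid,
    warped_feat: Grid,
    mask: Grid,
    w_pixel: float = 1.0,    # hypothetical weight, not taken from the paper
    w_feature: float = 1.0,  # hypothetical weight, not taken from the paper
) -> float:
    """Weighted sum of a pixel-wise term and a feature-wise term."""
    pixel_term = masked_l1(ref_img, warped_img, mask)
    feature_term = masked_l1(ref_feat, warped_feat, mask)
    return w_pixel * pixel_term + w_feature * feature_term


# Toy 2 x 2 example; the bottom-right position is marked invalid,
# as it would be if the warp fell outside the source view.
ref_img = [[0.2, 0.4], [0.6, 0.8]]
warped_img = [[0.2, 0.5], [0.6, 0.0]]
ref_feat = [[1.0, 2.0], [3.0, 4.0]]
warped_feat = [[1.5, 2.0], [3.0, 9.0]]
mask = [[1, 1], [1, 0]]

loss = multi_metric_loss(
    ref_img, warped_img, ref_feat, warped_feat, mask, w_pixel=1.0, w_feature=0.5
)
```

In this toy case the masked position contributes to neither term, so a large mismatch at an invalid location does not affect the loss. The normal-depth consistency term mentioned in the abstract is not covered by this sketch.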

Code Implementations: 1 repo
