Adaptive Learning for Multi-view Stereo Reconstruction
This work addresses depth prediction accuracy in multi-view stereo 3D reconstruction, a computer vision problem, and represents an incremental advance.
The paper tackles inaccurate depth prediction in multi-view stereo reconstruction by proposing an adaptive Wasserstein loss function and an offset module, achieving state-of-the-art performance on the DTU, Tanks and Temples, and BlendedMVS benchmarks.
Deep learning has recently demonstrated excellent performance on the task of multi-view stereo (MVS). However, the loss functions used in deep MVS are rarely studied. In this paper, we first analyze the properties of existing loss functions for deep depth-based MVS approaches. Regression-based losses yield continuous but inaccurate results, because depth is computed as a mathematical expectation, while classification-based losses output discretized depth values. We therefore propose a novel loss function, named adaptive Wasserstein loss, which narrows the difference between the true and predicted probability distributions of depth. In addition, a simple but effective offset module is introduced to better achieve sub-pixel prediction accuracy. Extensive experiments on different benchmarks, including DTU, Tanks and Temples, and BlendedMVS, show that the proposed method with the adaptive Wasserstein loss and the offset module achieves state-of-the-art performance.
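
The contrast between the two loss families, and the motivation for a distribution-level loss, can be illustrated with a small sketch. It is not the paper's adaptive Wasserstein loss, whose formulation is not given in this abstract; it uses the standard one-dimensional Wasserstein-1 distance between two histograms over a shared set of depth hypotheses. The function names, the four depth bins, and the toy probabilities are all assumptions chosen for illustration.

```python
import numpy as np


def expected_depth(prob, depth_bins):
    """Regression-style readout: probability-weighted mean of the depth hypotheses."""
    return float(np.sum(prob * depth_bins))


def argmax_depth(prob, depth_bins):
    """Classification-style readout: depth of the most likely hypothesis."""
    return float(depth_bins[int(np.argmax(prob))])


def wasserstein_1d(p, q, depth_bins):
    """Standard Wasserstein-1 distance between two histograms on the same sorted bins.

    In one dimension it equals the area between the two cumulative distributions.
    """
    cdf_gap = np.abs(np.cumsum(p) - np.cumsum(q))[:-1]
    return float(np.sum(cdf_gap * np.diff(depth_bins)))


# Hypothetical depth hypotheses (arbitrary units) and toy distributions.
depth_bins = np.array([1.0, 2.0, 3.0, 4.0])
bimodal = np.array([0.5, 0.0, 0.0, 0.5])  # ambiguous pixel with two peaks
truth = np.array([0.0, 1.0, 0.0, 0.0])    # one-hot ground truth at depth 2.0
near = np.array([0.0, 0.0, 1.0, 0.0])     # prediction one bin away
far = np.array([0.0, 0.0, 0.0, 1.0])      # prediction two bins away

print(expected_depth(bimodal, depth_bins))     # 2.5, a depth neither peak supports
print(argmax_depth(bimodal, depth_bins))       # 1.0, restricted to a bin value
print(wasserstein_1d(near, truth, depth_bins)) # 1.0
print(wasserstein_1d(far, truth, depth_bins))  # 2.0
```

In this toy case the expectation lands between the two peaks and the argmax can only return a bin value, which mirrors the two failure modes described above. The Wasserstein distance grows with how far the predicted mass sits from the true depth, whereas a per-bin cross-entropy would score the `near` and `far` predictions identically.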