Detail-aware multi-view stereo network for depth estimation
This work addresses a specific challenge in 3D reconstruction for computer vision applications, representing an incremental improvement over existing coarse-to-fine frameworks.
The paper tackles the problem of poor depth estimation at object boundaries and detail regions in multi-view stereo methods by proposing a detail-aware network (DA-MVSNet) that uses geometric depth clues and an image synthesis loss, achieving competitive results on DTU and Tanks & Temples datasets.
Multi-view stereo methods have achieved great success in depth estimation with coarse-to-fine depth learning frameworks. However, existing methods perform poorly when recovering depth at object boundaries and in detail regions. To address these issues, we propose a detail-aware multi-view stereo network (DA-MVSNet) with a coarse-to-fine framework. The geometric depth clues hidden in the coarse stage are utilized to maintain the geometric structural relationships between object surfaces and enhance the expressive capability of image features. In addition, an image synthesis loss is employed to constrain the gradient flow for detail regions and further strengthen the supervision of object boundaries and texture-rich areas. Finally, we propose an adaptive depth interval adjustment strategy to improve the accuracy of object reconstruction. Extensive experiments on the DTU and Tanks & Temples datasets demonstrate that our method achieves competitive results. The code is available at https://github.com/wsmtht520-/DAMVSNet.
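The abstract does not specify how the adaptive depth interval adjustment works, so the following is only an illustrative sketch of one common way such a strategy can be realized in coarse-to-fine pipelines, not the authors' method. It assumes that the coarse stage yields a per-pixel probability distribution over depth hypotheses, and it narrows or widens the next stage's search range according to the spread of that distribution. The function name `refine_depth_hypotheses` and the parameters `num_next`, `scale`, and `min_half_range` are hypothetical.

```python
import math


def refine_depth_hypotheses(probs, depths, num_next=8, scale=1.5,
                            min_half_range=1e-3):
    """Return `num_next` depth hypotheses for the next stage of one pixel.

    probs  : coarse-stage probabilities, one per depth hypothesis
    depths : the coarse-stage depth hypotheses themselves
    The new hypotheses are centered on the expected depth, and their
    range is proportional to the standard deviation of the coarse
    distribution (an assumed heuristic, not taken from the paper).
    """
    # Normalize in case the probabilities do not sum exactly to one.
    total = sum(probs)
    probs = [p / total for p in probs]

    # Expected depth and variance under the coarse distribution.
    mean = sum(p * d for p, d in zip(probs, depths))
    var = sum(p * (d - mean) ** 2 for p, d in zip(probs, depths))

    # Confident pixels (small variance) get a narrow search range;
    # uncertain pixels get a wide one. A floor avoids a zero range.
    half_range = max(scale * math.sqrt(var), min_half_range)

    # Evenly spaced hypotheses over [mean - half_range, mean + half_range].
    step = 2.0 * half_range / (num_next - 1)
    return [mean - half_range + i * step for i in range(num_next)]


if __name__ == "__main__":
    coarse_depths = [1.0, 2.0, 3.0, 4.0]
    confident = refine_depth_hypotheses([0.02, 0.96, 0.02, 0.0], coarse_depths)
    uncertain = refine_depth_hypotheses([0.25, 0.25, 0.25, 0.25], coarse_depths)
    print("confident span:", confident[-1] - confident[0])
    print("uncertain span:", uncertain[-1] - uncertain[0])
```

Under this heuristic, a pixel with a sharply peaked coarse distribution is sampled densely around its estimate, while an ambiguous pixel, such as one near an object boundary, keeps a wider range so the true depth is less likely to fall outside the refined hypotheses.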