Non-local Recurrent Regularization Networks for Multi-view Stereo
This paper addresses the problem of accurate depth estimation for 3D reconstruction in computer vision, proposing a novel method for a known bottleneck.
The paper tackles the limitation of existing recurrent methods in multi-view stereo that only model local dependencies in the depth domain, proposing NR2-Net with a depth attention module and gated recurrent modeling to capture non-local depth interactions and global scene context. The method achieves state-of-the-art reconstruction results on DTU and Tanks and Temples datasets.
In deep multi-view stereo networks, cost regularization is crucial for accurate depth estimation. Since 3D cost volume filtering is usually memory-consuming, recurrent 2D cost map regularization has recently become popular and has shown great potential in reconstructing 3D models of different scales. However, existing recurrent methods model only local dependencies in the depth domain, which greatly limits their ability to capture the global scene context along the depth dimension. To tackle this limitation, we propose a novel non-local recurrent regularization network for multi-view stereo, named NR2-Net. Specifically, we design a depth attention module to capture non-local depth interactions within a sliding depth block. Then, the global scene context between different blocks is modeled in a gated recurrent manner. In this way, long-range dependencies along the depth dimension are captured to facilitate cost regularization. Moreover, we design a dynamic depth map fusion strategy to improve the robustness of the algorithm. Our method achieves state-of-the-art reconstruction results on both the DTU and Tanks and Temples datasets.
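The two-level idea in the abstract (attention across depth slices inside a block, then a gated recurrent state carried from block to block) can be illustrated with a small NumPy sketch. This is not the NR2-Net architecture: the abstract does not specify layer shapes, block overlap, or the gating equations, so everything below is an assumption made for illustration. In particular, the blocks here do not overlap, the gate is a standard GRU update with 1x1 (per-pixel) weights, the block is summarized by its mean, and the names `depth_attention`, `gated_update`, `regularize`, `regress_depth`, and `init_params` are hypothetical. The dynamic depth map fusion step is not covered.

```python
import numpy as np


def softmax(x, axis):
    # Numerically stable softmax along one axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def depth_attention(block):
    """Per-pixel self-attention across the depth slices of one block.

    block: (B, C, H, W) cost maps for B consecutive depth hypotheses.
    Assumption: plain dot-product attention with a residual connection.
    """
    channels = block.shape[1]
    # scores[i, j, h, w] = similarity of slice i and slice j at pixel (h, w)
    scores = np.einsum("ichw,jchw->ijhw", block, block) / np.sqrt(channels)
    weights = softmax(scores, axis=1)  # normalize over the attended slice j
    return block + np.einsum("ijhw,jchw->ichw", weights, block)


def gated_update(x, hidden, p):
    """Standard GRU update with 1x1 weights (an assumed stand-in for the gate)."""
    xh = np.concatenate([x, hidden], axis=0)
    z = sigmoid(np.einsum("oc,chw->ohw", p["Wz"], xh))  # update gate
    r = sigmoid(np.einsum("oc,chw->ohw", p["Wr"], xh))  # reset gate
    xrh = np.concatenate([x, r * hidden], axis=0)
    candidate = np.tanh(np.einsum("oc,chw->ohw", p["Wh"], xrh))
    return (1.0 - z) * hidden + z * candidate


def init_params(channels, seed=0):
    # Random weights only; a real network would learn these.
    rng = np.random.default_rng(seed)
    shape = (channels, 2 * channels)
    return {
        "Wz": rng.normal(0.0, 0.1, shape),
        "Wr": rng.normal(0.0, 0.1, shape),
        "Wh": rng.normal(0.0, 0.1, shape),
        "w_out": rng.normal(0.0, 0.1, channels),
    }


def regularize(cost_maps, block_size, p):
    """cost_maps: (D, C, H, W) -> probability volume (D, H, W)."""
    _, channels, height, width = cost_maps.shape
    hidden = np.zeros((channels, height, width))
    scores = []
    for start in range(0, cost_maps.shape[0], block_size):
        # Non-local interactions inside the current depth block.
        block = depth_attention(cost_maps[start:start + block_size])
        # Context carried between blocks through the gated hidden state.
        hidden = gated_update(block.mean(axis=0), hidden, p)
        for depth_slice in block:
            scores.append(np.einsum("c,chw->hw", p["w_out"], depth_slice + hidden))
    return softmax(np.stack(scores), axis=0)


def regress_depth(prob, depth_values):
    # Expected depth under the per-pixel probability distribution.
    return np.einsum("dhw,d->hw", prob, depth_values)


# Toy usage with random costs: 8 depth hypotheses, 4 channels, 5x6 pixels.
costs = np.random.default_rng(1).normal(size=(8, 4, 5, 6))
prob = regularize(costs, block_size=3, p=init_params(4))
depth = regress_depth(prob, np.linspace(1.0, 2.0, 8))
print(prob.shape, depth.shape)
```

Because only one block of cost maps and one hidden state are held at a time, memory grows with the block size rather than with the full number of depth hypotheses, which is the motivation the abstract gives for recurrent 2D regularization over 3D cost volume filtering.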