Digging Into Normal Incorporated Stereo Matching
This work addresses a known bottleneck in stereo matching for computer vision applications, but the contribution is incremental: it builds on existing learning-based methods and adds specific geometric enhancements.
The paper tackles disparity estimation in challenging regions of stereo matching (low-texture, occluded, and bordered areas) by incorporating geometric guidance from normal maps. At the time the work was finished, the authors report that the method ranked 1st for foreground pixels on the KITTI 2015 dataset and 3rd on the Scene Flow dataset among published works.
Despite the remarkable progress facilitated by learning-based stereo-matching algorithms, disparity estimation in low-texture, occluded, and bordered regions remains a bottleneck that limits performance. To tackle these challenges, geometric guidance such as plane information is necessary, as it provides intuitive guidance about disparity consistency and affinity similarity. In this paper, we propose a normal incorporated joint learning framework consisting of two specific modules: non-local disparity propagation (NDP) and affinity-aware residual learning (ARL). The estimated normal map is first used to calculate a non-local affinity matrix and a non-local offset, which together perform spatial propagation at the disparity level. To enhance geometric consistency, especially in low-texture regions, the estimated normal map is then used to calculate a local affinity matrix, which tells the residual learning where each correction should refer and thus improves its efficiency. Extensive experiments on several public datasets, including Scene Flow, KITTI 2015, and Middlebury 2014, validate the effectiveness of our proposed method. By the time we finished this work, our approach ranked 1st for stereo matching across foreground pixels on the KITTI 2015 dataset and 3rd on the Scene Flow dataset among all the published works.
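To make the idea of normal-guided propagation concrete, the following is a minimal sketch, not the implementation described in the paper. It assumes unit-length normals, uses fixed sampling offsets in place of the predicted non-local offsets, and takes a softmax over normal cosine similarity as a stand-in for the affinity matrix. The function name `normal_affinity_propagate` and the `temperature` parameter are hypothetical, introduced only for illustration.

```python
import math


def normal_affinity_propagate(disparity, normals, offsets, temperature=0.1):
    """One propagation step (illustrative assumption, not the paper's NDP).

    disparity: 2D list of floats, shape H x W.
    normals:   2D list of unit 3-vectors, shape H x W.
    offsets:   list of (dy, dx) sampling offsets; fixed here, whereas the
               paper describes offsets computed from the normal map.
    Each output disparity is an affinity-weighted average of the disparities
    at the pixel itself and at its sampled neighbours.
    """
    height, width = len(disparity), len(disparity[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            n_ref = normals[y][x]
            scores, values = [], []
            for dy, dx in [(0, 0)] + list(offsets):
                yy, xx = y + dy, x + dx
                if not (0 <= yy < height and 0 <= xx < width):
                    continue  # skip samples that fall outside the image
                n_nb = normals[yy][xx]
                # Cosine similarity of unit normals: near 1 on the same plane.
                cos_sim = sum(a * b for a, b in zip(n_ref, n_nb))
                scores.append(cos_sim / temperature)
                values.append(disparity[yy][xx])
            # Numerically stable softmax over the sampled neighbours.
            top = max(scores)
            weights = [math.exp(s - top) for s in scores]
            total = sum(weights)
            out[y][x] = sum(w * v for w, v in zip(weights, values)) / total
    return out


# Toy scene: one row with two planes; the second pixel carries a noisy disparity.
disp = [[10.0, 12.0, 50.0, 50.0]]
nrm = [[(0.0, 0.0, 1.0), (0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]]
smoothed = normal_affinity_propagate(disp, nrm, offsets=[(0, -1), (0, 1)])
```

In this toy scene, pixels that share a normal average their disparities, while the pixel across the plane boundary contributes almost nothing because its affinity is close to zero. That is the intuition behind using normals to decide where disparity should be propagated from.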