Cost Volume Pyramid Network with Multi-strategies Range Searching for Multi-view Stereo
This work addresses multi-view stereo reconstruction for computer vision applications and represents an incremental improvement over existing cost volume pyramid methods.
The paper tackles the problem of multi-view stereo by proposing a cost volume pyramid network that uses different depth range sampling strategies and adaptive unimodal filtering at each stage, achieving more accurate depth estimation. Results show that it outperforms most state-of-the-art methods on the DTU and BlendedMVS datasets.
Multi-view stereo is an important yet still challenging research task in computer vision. In recent years, deep learning-based methods have shown superior performance on this task. Cost volume pyramid network-based methods, which progressively refine the depth map in a coarse-to-fine manner, have yielded promising results while consuming less memory. However, these methods fail to fully consider the characteristics of the cost volume at each stage, and as a result adopt similar range search strategies for every stage. In this work, we present a novel cost volume pyramid-based network with different searching strategies for multi-view stereo. By choosing different depth range sampling strategies and applying adaptive unimodal filtering, we are able to obtain more accurate depth estimation in the low-resolution stages and iteratively upsample the depth map to arbitrary resolution. We conducted extensive experiments on both the DTU and BlendedMVS datasets, and the results show that our method outperforms most state-of-the-art methods.
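To make the coarse-to-fine range search concrete, the following is a minimal, generic sketch for a single pixel. It is not the network described above: the names `sample_depths`, `coarse_to_fine_depth`, and `cost_fn`, the uniform sampling, the per-stage plane counts, and the rule for narrowing the range are all illustrative assumptions, and the adaptive unimodal filtering is not modeled.

```python
def sample_depths(lo, hi, num_planes):
    """Return num_planes uniformly spaced depth hypotheses in [lo, hi]."""
    if num_planes < 2:
        raise ValueError("need at least two depth planes per stage")
    step = (hi - lo) / (num_planes - 1)
    return [lo + i * step for i in range(num_planes)]


def coarse_to_fine_depth(cost_fn, d_min, d_max, planes_per_stage=(48, 16, 8)):
    """Estimate one pixel's depth by narrowing the search range stage by stage.

    cost_fn is a hypothetical stand-in for a per-pixel matching cost
    (lower is better); a real pipeline would read it from a cost volume.
    """
    lo, hi = d_min, d_max
    estimate = None
    for num_planes in planes_per_stage:
        depths = sample_depths(lo, hi, num_planes)
        # Pick the hypothesis with the lowest matching cost at this stage.
        estimate = min(depths, key=cost_fn)
        # Assumed narrowing rule: the next stage searches one sampling
        # interval on either side of the current estimate.
        interval = (hi - lo) / (num_planes - 1)
        lo = max(d_min, estimate - interval)
        hi = min(d_max, estimate + interval)
    return estimate
```

For example, with a toy cost such as `lambda d: abs(d - 612.5)` over the range 425.0 to 935.0, each stage places fewer planes over a much narrower interval, so the estimate tightens around the minimum without ever sampling the full range densely. Giving each stage its own plane count and range is the simplest form of a per-stage search strategy under these assumptions.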