Co-Teaching: An Ark to Unsupervised Stereo Matching
This work addresses a specific bottleneck (occlusions) in unsupervised stereo matching for autonomous driving, representing an incremental improvement over existing methods.
The paper targets the poor performance of unsupervised stereo matching near occlusions, a weakness that matters for autonomous driving perception. It proposes CoT-Stereo, a co-teaching framework in which two networks interactively teach each other about occlusions. On the KITTI Stereo benchmarks, CoT-Stereo outperforms state-of-the-art unsupervised approaches in both accuracy and speed.
Stereo matching is a key component of autonomous driving perception. Recent unsupervised stereo matching approaches have attracted considerable attention because they do not require disparity ground truth. These approaches, however, perform poorly near occlusions. To overcome this drawback, we propose CoT-Stereo, a novel unsupervised stereo matching approach. Specifically, we adopt a co-teaching framework in which two networks interactively teach each other about occlusions in an unsupervised fashion, which greatly improves the robustness of unsupervised stereo matching. Extensive experiments on the KITTI Stereo benchmarks demonstrate the superior performance of CoT-Stereo over all other state-of-the-art unsupervised stereo matching approaches in terms of both accuracy and speed. Our project webpage is https://sites.google.com/view/cot-stereo.
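The abstract does not spell out how the two networks exchange knowledge about occlusions, so the following is only a minimal sketch of one way such an exchange could look, not the CoT-Stereo implementation. It assumes a small-error selection rule in the spirit of co-teaching for noisy labels: each network marks the pixels where its own photometric error is smallest as reliable (likely non-occluded), and its peer computes its loss only on those pixels. All function names, the 1D scanline setting, and the `keep_ratio` parameter are hypothetical.

```python
import math

# Illustrative sketch only: the selection rule and every name below are
# assumptions, not taken from the CoT-Stereo paper.


def photometric_error(left, right, disparity):
    """Per-pixel absolute error after warping the right scanline to the left view."""
    errors = []
    for x, d in enumerate(disparity):
        src = x - d  # pixel x in the left image corresponds to x - d in the right image
        if 0 <= src < len(right):
            errors.append(abs(left[x] - right[src]))
        else:
            errors.append(math.inf)  # correspondence falls outside the image
    return errors


def small_error_mask(errors, keep_ratio):
    """Mark the keep_ratio fraction of pixels with the smallest error as reliable."""
    n_keep = int(len(errors) * keep_ratio)
    order = sorted(range(len(errors)), key=lambda i: errors[i])
    keep = set(order[:n_keep])
    return [i in keep for i in range(len(errors))]


def masked_loss(errors, mask):
    """Mean of the finite errors at the pixels selected by the mask."""
    selected = [e for e, m in zip(errors, mask) if m and math.isfinite(e)]
    return sum(selected) / len(selected) if selected else 0.0


def co_teaching_losses(left, right, disp_a, disp_b, keep_ratio=0.75):
    """Each network's loss is restricted to pixels its peer considers reliable."""
    err_a = photometric_error(left, right, disp_a)
    err_b = photometric_error(left, right, disp_b)
    mask_from_a = small_error_mask(err_a, keep_ratio)
    mask_from_b = small_error_mask(err_b, keep_ratio)
    loss_a = masked_loss(err_a, mask_from_b)  # network A is taught by B
    loss_b = masked_loss(err_b, mask_from_a)  # network B is taught by A
    return loss_a, loss_b


# Toy scanline with a true disparity of 1 everywhere.
left = [10, 20, 30, 40, 50, 60, 70, 80]
right = [20, 30, 40, 50, 60, 70, 80, 90]
disp_a = [1, 1, 1, 1, 1, 1, 1, 1]
disp_b = [1, 1, 1, 1, 1, 1, 1, 0]  # network B is wrong at the last pixel
loss_a, loss_b = co_teaching_losses(left, right, disp_a, disp_b)
```

In this toy case the pixel that network B gets wrong lies outside the mask supplied by network A, so it does not contribute to B's loss. In a real system the disparities would come from two trained networks and the masked losses would drive their gradient updates.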