CV | Mar 14, 2018

EdgeStereo: A Context Integrated Residual Pyramid Network for Stereo Matching

arXiv:1803.05196v3 | 203 citations
Originality: Incremental advance
AI Analysis

This work addresses accuracy issues in stereo vision for applications such as autonomous driving, though it is incremental in that it builds on existing CNN-based methods.

The paper tackles stereo matching failures in textureless regions, at object boundaries, and on tiny details by proposing EdgeStereo, a multi-task network that jointly predicts disparity and edge maps and reports state-of-the-art performance on the KITTI Stereo and Scene Flow benchmarks.

Recent convolutional neural networks, especially end-to-end disparity estimation models, achieve remarkable performance on the stereo matching task. However, existing methods, even those with a complicated cascade structure, may fail in textureless regions, at boundaries, and on tiny details. Focusing on these problems, we propose a multi-task network, EdgeStereo, that is composed of a backbone disparity network and an edge sub-network. Given a binocular image pair, our model enables end-to-end prediction of both a disparity map and an edge map. Specifically, we design a context pyramid to encode multi-scale context information in the disparity branch, followed by a compact residual pyramid for cascaded refinement. To further preserve subtle details, our EdgeStereo model integrates edge cues through feature embedding and edge-aware smoothness loss regularization. Comparative results demonstrate that stereo matching and edge detection can help each other in the unified model. Furthermore, our method achieves state-of-the-art performance on both the KITTI Stereo and Scene Flow benchmarks, which demonstrates the effectiveness of our design.
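
To make the idea of an edge-aware smoothness term concrete, here is a minimal sketch in plain Python. It is not the paper's formulation, which the abstract does not spell out. It assumes one common form, in which each disparity gradient is down-weighted by exp(-beta * |edge gradient|) so that disparity jumps coinciding with edges are penalized less. The function name `edge_aware_smoothness` and the parameter `beta` are hypothetical and chosen for illustration only.

```python
import math


def edge_aware_smoothness(disparity, edge_prob, beta=2.0):
    """Mean of |disparity gradient| * exp(-beta * |edge gradient|).

    disparity, edge_prob: 2-D lists of floats with identical shape.
    beta: hypothetical weighting factor (an assumption, not from the paper).
    """
    height, width = len(disparity), len(disparity[0])
    total, count = 0.0, 0
    for y in range(height):
        for x in range(width):
            # Forward difference to the right-hand neighbour.
            if x + 1 < width:
                d_grad = abs(disparity[y][x + 1] - disparity[y][x])
                e_grad = abs(edge_prob[y][x + 1] - edge_prob[y][x])
                total += d_grad * math.exp(-beta * e_grad)
                count += 1
            # Forward difference to the neighbour below.
            if y + 1 < height:
                d_grad = abs(disparity[y + 1][x] - disparity[y][x])
                e_grad = abs(edge_prob[y + 1][x] - edge_prob[y][x])
                total += d_grad * math.exp(-beta * e_grad)
                count += 1
    return total / count if count else 0.0


# Toy data: one disparity step, with and without a matching edge response.
disparity = [[0.0, 0.0, 1.0, 1.0]]
no_edge = [[0.0, 0.0, 0.0, 0.0]]
edge_at_step = [[0.0, 0.0, 1.0, 1.0]]

loss_no_edge = edge_aware_smoothness(disparity, no_edge)
loss_with_edge = edge_aware_smoothness(disparity, edge_at_step)
```

In this toy case the disparity step costs less when the edge map changes at the same location, which is the behaviour such a regularizer is meant to encourage: smooth disparity inside objects, sharp transitions at boundaries.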

