CV NE
Mar 13, 2017

End-to-End Learning of Geometry and Context for Deep Stereo Regression

Alex Kendall, Hayk Martirosyan, Saumitro Dasgupta, Peter Henry, Ryan Kennedy, Abraham Bachrach, Adam Bry

arXiv:1703.04309v1 38.21512 citations

Originality: Highly original

AI Analysis

This work addresses stereo vision for applications like autonomous driving, with incremental improvements in accuracy and efficiency.

The paper tackles disparity regression from stereo images by introducing an end-to-end deep learning architecture that leverages geometry and contextual information, achieving state-of-the-art results on the KITTI dataset with improved speed.

We propose a novel deep learning architecture for regressing disparity from a rectified pair of stereo images. We leverage knowledge of the problem's geometry to form a cost volume using deep feature representations. We learn to incorporate contextual information using 3-D convolutions over this volume. Disparity values are regressed from the cost volume using a proposed differentiable soft argmin operation, which allows us to train our method end-to-end to sub-pixel accuracy without any additional post-processing or regularization. We evaluate our method on the Scene Flow and KITTI datasets and on KITTI we set a new state-of-the-art benchmark, while being significantly faster than competing approaches.
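The soft argmin step mentioned in the abstract can be illustrated for a single pixel. The sketch below is not the authors' implementation: it assumes the operation takes a softmax over the negated matching costs along the disparity axis and returns the probability-weighted mean of the candidate disparities. The function name `soft_argmin` and the example cost values are hypothetical.

```python
import math


def soft_argmin(costs):
    """Softmax-weighted mean index of `costs`; lower cost gets more weight.

    Illustrative sketch for one pixel, where `costs[d]` is the matching
    cost at integer disparity d (assumed layout, not from the paper's code).
    """
    # Negate the costs so that a low cost maps to a high probability.
    negated = [-c for c in costs]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(negated)
    exps = [math.exp(v - peak) for v in negated]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Expected disparity: each candidate index weighted by its probability.
    return sum(d * p for d, p in enumerate(probs))


# Hypothetical cost curve with two equally good candidates at d=2 and d=3.
costs = [9.0, 4.0, 0.5, 0.5, 4.0, 9.0]
estimate = soft_argmin(costs)
```

Because the result is a weighted average rather than a hard index, it can fall between integer disparities (here, midway between 2 and 3), and every step is differentiable, which is what makes end-to-end training to sub-pixel accuracy possible.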
