CV · LG · NE · Oct 20, 2015

Stereo Matching by Training a Convolutional Neural Network to Compare Image Patches

arXiv:1510.05970v2 · 1455 citations
Originality: Highly original
AI Analysis

This work addresses depth extraction from stereo images for applications like robotics and autonomous driving, presenting a novel method for matching cost computation.

The paper learns a similarity measure on small image patches with a convolutional neural network and uses the network output as the stereo matching cost. The authors report that the method outperforms other approaches on the KITTI 2012, KITTI 2015, and Middlebury data sets.

We present a method for extracting depth information from a rectified image pair. Our approach focuses on the first stage of many stereo algorithms: the matching cost computation. We approach the problem by learning a similarity measure on small image patches using a convolutional neural network. Training is carried out in a supervised manner by constructing a binary classification data set with examples of similar and dissimilar pairs of patches. We examine two network architectures for this task: one tuned for speed, the other for accuracy. The output of the convolutional neural network is used to initialize the stereo matching cost. A series of post-processing steps follow: cross-based cost aggregation, semiglobal matching, a left-right consistency check, subpixel enhancement, a median filter, and a bilateral filter. We evaluate our method on the KITTI 2012, KITTI 2015, and Middlebury stereo data sets and show that it outperforms other approaches on all three data sets.
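To make the role of the matching cost concrete, here is a minimal sketch in plain Python, not the authors' implementation. It assumes a mean absolute difference between patches as a stand-in for the learned CNN similarity, picks the lowest-cost disparity per pixel (winner-take-all) in place of the cost aggregation and semiglobal matching steps, and uses a simplified left-right consistency test. The function names (`patch_cost`, `cost_volume`, `winner_take_all`, `left_right_consistent`) and the parameters `r`, `max_disp`, and `tol` are hypothetical.

```python
def patch_cost(left, right, y, xl, xr, r=1):
    """Stand-in matching cost: mean absolute difference between two
    (2r+1)x(2r+1) patches. In the paper this role is played by the
    learned CNN similarity; this function is NOT the paper's network."""
    h, w = len(left), len(left[0])
    total, count = 0.0, 0
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            yy = min(max(y + dy, 0), h - 1)  # clamp rows at the border
            a = left[yy][min(max(xl + dx, 0), w - 1)]
            b = right[yy][min(max(xr + dx, 0), w - 1)]
            total += abs(a - b)
            count += 1
    return total / count


def cost_volume(left, right, max_disp, r=1):
    """C[y][x][d]: cost of matching left pixel (x, y) to right pixel (x - d, y).
    Disparities that would fall outside the right image get infinite cost."""
    h, w = len(left), len(left[0])
    inf = float("inf")
    return [[[patch_cost(left, right, y, x, x - d, r) if x - d >= 0 else inf
              for d in range(max_disp + 1)]
             for x in range(w)]
            for y in range(h)]


def winner_take_all(volume):
    """Pick, for every pixel, the disparity with the lowest cost."""
    return [[min(range(len(costs)), key=costs.__getitem__) for costs in row]
            for row in volume]


def left_right_consistent(disp_left, disp_right, tol=1):
    """Simplified left-right check: a left pixel x with disparity d is kept
    only if the right map at x - d reports a disparity within tol of d."""
    mask = []
    for row_l, row_r in zip(disp_left, disp_right):
        out = []
        for x, d in enumerate(row_l):
            xr = x - d
            out.append(xr >= 0 and abs(d - row_r[xr]) <= tol)
        mask.append(out)
    return mask
```

Images here are lists of rows of grey values. Calling `winner_take_all(cost_volume(left, right, max_disp))` yields a raw disparity map; swapping `patch_cost` for a trained patch-comparison network is the substitution the abstract describes, with the remaining post-processing steps applied afterwards.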

Code Implementations: 2 repos
