Deep Stereo Matching with Dense CRF Priors
This work aims to improve stereo matching accuracy for computer vision applications. It is an incremental advance that incorporates CRF regularization into a deep learning framework.
The paper tackles stereo reconstruction by proposing an end-to-end deep network that integrates a dense Conditional Random Field as a prior for regularization, outperforming alternative end-to-end methods and competing with hand-engineered approaches on synthetic and real-world datasets.
Stereo reconstruction from rectified images has recently been revisited within the context of deep learning. Using a deep Convolutional Neural Network to obtain patch-wise matching cost volumes has resulted in state-of-the-art stereo reconstruction on classic datasets like Middlebury and KITTI. Introducing this learned cost into a classical stereo pipeline improves the final results dramatically over non-learning-based cost models. However, these pipelines typically include hand-engineered post-processing steps to regularize and clean the result. Here, we show that it is possible to take a more holistic approach by training a fully end-to-end network that directly includes regularization in the form of a densely connected Conditional Random Field (CRF), which acts as a prior on inter-pixel interactions. We demonstrate on both synthetic and real-world datasets that our approach outperforms an alternative end-to-end network and compares favorably to more hand-engineered approaches.
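To make the idea of a dense CRF acting as a prior on a matching cost volume more concrete, the following is a minimal NumPy sketch and not the network described here. It assumes several things the abstract does not specify: mean-field inference, a single Gaussian kernel over per-pixel features, and a Potts-style compatibility that rewards agreeing labels. The names `dense_crf_mean_field`, `weight`, `sigma`, and `iters` are hypothetical and chosen only for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def dense_crf_mean_field(unary_cost, features, weight=1.0, sigma=1.0, iters=5):
    """Approximate per-pixel disparity marginals with mean-field updates.

    unary_cost: (N, D) matching cost for N pixels and D disparity labels
                (lower is better), e.g. a flattened cost volume.
    features:   (N, F) per-pixel features such as position or intensity.
    All hyperparameters are illustrative assumptions.
    """
    # Dense (all-pairs) Gaussian affinity between pixels.
    diff = features[:, None, :] - features[None, :, :]
    kernel = np.exp(-(diff ** 2).sum(axis=-1) / (2.0 * sigma ** 2))
    np.fill_diagonal(kernel, 0.0)  # a pixel sends no message to itself

    # Initialise the marginals from the unary costs alone.
    q = softmax(-unary_cost, axis=1)
    for _ in range(iters):
        # Message passing: each pixel gathers its neighbours' beliefs.
        message = kernel @ q
        # Potts-style update: agreeing with similar pixels lowers the energy.
        q = softmax(-unary_cost + weight * message, axis=1)
    return q


# Toy scanline: 5 pixels, 3 candidate disparities.
# The middle pixel weakly prefers label 2; its neighbours prefer label 0.
unary = np.array([
    [0.0, 2.0, 2.0],
    [0.0, 2.0, 2.0],
    [1.0, 2.0, 0.5],
    [0.0, 2.0, 2.0],
    [0.0, 2.0, 2.0],
])
positions = np.arange(5, dtype=float)[:, None]

marginals = dense_crf_mean_field(unary, positions)
disparity = marginals.argmax(axis=1)
```

In this toy case the unary costs alone would assign the middle pixel a different disparity from its neighbours, and the dense pairwise term pulls it back into agreement. Because every step is a differentiable array operation, a fixed number of such iterations could in principle be unrolled as network layers and trained jointly with the cost network, which is one common way to realise end-to-end CRF regularization. The all-pairs kernel here costs O(N²) memory and is only practical for tiny inputs.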