CV | Sep 2, 2025

Decoupling Bidirectional Geometric Representations of 4D cost volume with 2D convolution

arXiv:2509.02415v1 | h-index: 2 | Has Code
Originality: Highly original
AI Analysis

The paper addresses high-performance stereo matching on mobile devices by providing a lightweight alternative to 3D regularization.

The paper tackles real-time stereo matching by proposing DBStereo, a deployment-friendly 4D cost aggregation network built on pure 2D convolutions. According to the authors, it outperforms all existing aggregation-based methods in both inference time and accuracy, and even surpasses the iterative method IGEV-Stereo.

High-performance real-time stereo matching methods invariably rely on 3D regularization of the cost volume, which is unfriendly to mobile devices, while methods based on 2D regularization struggle in ill-posed regions. In this paper, we present DBStereo, a deployment-friendly 4D cost aggregation network based on pure 2D convolutions. Specifically, we first provide a thorough analysis of the decoupling characteristics of the 4D cost volume, and then design a lightweight bidirectional geometry aggregation block that captures spatial and disparity representations separately. Through decoupled learning, our approach achieves real-time performance and impressive accuracy simultaneously. Extensive experiments demonstrate that DBStereo outperforms all existing aggregation-based methods in both inference time and accuracy, even surpassing the iterative method IGEV-Stereo. Our study breaks with the empirical design of using 3D convolutions for the 4D cost volume and provides a simple yet strong baseline of the proposed decoupled aggregation paradigm for further study. Code will be available at https://github.com/happydummy/DBStereo soon.
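To make the decoupling idea concrete, here is a rough cost sketch, not the authors' implementation. It assumes a cost volume of shape (C, D, H, W), and it assumes that the spatial branch runs a 2D convolution over the (H, W) plane of each disparity slice while the disparity branch runs a 2D convolution over the (D, W) plane of each image row. That plane split, the function names, and the example sizes are all hypothetical; the abstract does not specify them.

```python
def conv3d_macs(c_in, c_out, d, h, w, k=3):
    """Multiply-accumulates for one k*k*k 3D convolution (stride 1, same padding)."""
    return c_in * c_out * k ** 3 * d * h * w


def conv2d_macs(c_in, c_out, rows, cols, k=3):
    """Multiply-accumulates for one k*k 2D convolution (stride 1, same padding)."""
    return c_in * c_out * k ** 2 * rows * cols


def decoupled_macs(c_in, c_out, d, h, w, k=3):
    """Cost of a hypothetical two-branch 2D replacement for one 3D convolution."""
    # Assumed spatial branch: one 2D conv over (H, W) for each of the D slices.
    spatial = d * conv2d_macs(c_in, c_out, h, w, k)
    # Assumed disparity branch: one 2D conv over (D, W) for each of the H rows.
    disparity = h * conv2d_macs(c_out, c_out, d, w, k)
    return spatial + disparity


# Illustrative sizes only; they are not taken from the paper.
channels, disparities, height, width = 8, 48, 64, 128

full = conv3d_macs(channels, channels, disparities, height, width)
split = decoupled_macs(channels, channels, disparities, height, width)
ratio = full / split

print(f"3D conv MACs:       {full:,}")
print(f"decoupled 2D MACs:  {split:,}")
print(f"ratio (3D / 2D):    {ratio:.2f}")
```

Under these assumptions the two 3x3 branches cost 18 multiply-accumulates per output element against 27 for a 3x3x3 kernel, so the arithmetic saving is modest. The motivation stated in the abstract is deployment: 3D regularization is unfriendly to mobile devices, whereas 2D convolutions are not.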
