CV | RO | Apr 30, 2025

CMD: Constraining Multimodal Distribution for Domain Adaptation in Stereo Matching

arXiv:2504.21302v12 citations | h-index: 11 | Has Code | Pattern Recognition
Originality: Incremental advance
AI Analysis

This paper addresses domain adaptation in stereo matching for computer vision applications. The advance is incremental, as it builds on existing methods.

The paper tackles degraded generalization in unsupervised domain adaptation for stereo matching, which it attributes to multimodal disparity distributions, and proposes CMD to encourage unimodal distributions, improving prediction accuracy across multiple networks.

Recently, learning-based stereo matching methods have achieved great improvement on public benchmarks, where soft argmin and smooth L1 loss play a core role in their success. However, in unsupervised domain adaptation scenarios, we observe that these two operations often yield multimodal disparity probability distributions in target domains, resulting in degraded generalization. In this paper, we propose a novel approach, Constrain Multi-modal Distribution (CMD), to address this issue. Specifically, we introduce uncertainty-regularized minimization and anisotropic soft argmin to encourage the network to produce predominantly unimodal disparity distributions in the target domain, thereby improving prediction accuracy. Experimentally, we apply the proposed method to multiple representative stereo-matching networks and conduct domain adaptation from synthetic data to unlabeled real-world scenes. Results consistently demonstrate improved generalization in both top-performing and domain-adaptable stereo-matching models. The code for CMD will be available at: https://github.com/gallenszl/CMD.
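
To make the multimodality problem concrete, here is a minimal sketch of the standard soft argmin (the expected disparity under a softmax over negated matching costs) applied to a single pixel. It is not the paper's anisotropic soft argmin or its uncertainty-regularized minimization, neither of which is defined in this abstract. The toy cost values, the function names, and the use of entropy as a rough indicator of multimodality are all assumptions made for illustration.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_argmin(cost):
    # Standard soft argmin: expected disparity under softmax(-cost).
    probs = softmax([-c for c in cost])
    expected = sum(d * p for d, p in enumerate(probs))
    return expected, probs

def entropy(probs):
    # Shannon entropy; used here only as an illustrative spread measure.
    return -sum(p * math.log(p) for p in probs if p > 0.0)

# Hypothetical matching costs over disparities 0..9 for one pixel.
unimodal_cost = [10.0] * 10
unimodal_cost[2] = 0.0            # one clear match at disparity 2

bimodal_cost = [10.0] * 10
bimodal_cost[2] = 0.0             # two equally good matches,
bimodal_cost[8] = 0.0             # at disparities 2 and 8

d_uni, p_uni = soft_argmin(unimodal_cost)
d_bi, p_bi = soft_argmin(bimodal_cost)

print(f"unimodal: disparity={d_uni:.3f}, entropy={entropy(p_uni):.3f}")
print(f"bimodal:  disparity={d_bi:.3f}, entropy={entropy(p_bi):.3f}")
```

With one clear match the estimate lands close to disparity 2. With two equally plausible matches the expectation falls near 5, a disparity that neither mode supports, which is the kind of failure that motivates pushing the distribution toward a single mode.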

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
