CV, Mar 23

Deep S2P: Integrating Learning Based Stereo Matching Into the Satellite Stereo Pipeline

arXiv:2603.21882 | 16.0 | h-index: 16
AI Analysis

This work addresses the challenge of integrating learning-based stereo matching into operational satellite pipelines for Earth observation, representing an incremental improvement with domain-specific impact.

The authors integrated modern learning-based stereo matchers into the Satellite Stereo Pipeline to improve Digital Surface Model generation from satellite imagery, achieving consistent accuracy improvements over classical methods, though metrics like mean absolute error showed saturation effects.

Digital Surface Model generation from satellite imagery is a core task in Earth observation and is commonly addressed using classical stereoscopic matching algorithms in satellite pipelines such as the Satellite Stereo Pipeline (S2P). While recent learning-based stereo matchers achieve state-of-the-art performance on standard benchmarks, their integration into operational satellite pipelines remains challenging due to differences in viewing geometry and disparity assumptions. In this work, we integrate several modern learning-based stereo matchers, including StereoAnywhere, MonSter, Foundation Stereo, and a satellite fine-tuned variant of MonSter, into the Satellite Stereo Pipeline, adapting the rectification stage to enforce compatible disparity polarity and range. We release the corresponding code to enable reproducible use of these methods in large-scale Earth observation workflows. Experiments on satellite imagery show consistent improvements over classical cost-volume-based approaches in terms of Digital Surface Model accuracy, although commonly used metrics such as mean absolute error exhibit saturation effects. Qualitative results reveal substantially improved geometric detail and sharper structures, highlighting the need for evaluation strategies that better reflect perceptual and structural fidelity. At the same time, performance over challenging surface types such as vegetation remains limited across all evaluated models, indicating open challenges for learning-based stereo in natural environments.
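To make the "disparity polarity and range" issue concrete, here is a minimal sketch, not the authors' implementation. It assumes a rectified pair in which a reference pixel at column x matches the secondary image at column x + d, with d in a known range [d_min, d_max] that may be negative or of mixed sign. It also assumes the matcher follows the usual left-right convention, where the match lies at x - d' with d' >= 0. Translating the secondary image by a whole number of columns is one simple way to reconcile the two; the paper instead adapts the rectification stage itself. All function names below are hypothetical.

```python
import numpy as np


def shift_columns(img, s):
    """Translate image content by s columns (s > 0 moves right), zero padded."""
    out = np.zeros_like(img)
    w = img.shape[1]
    if abs(s) >= w:
        return out  # everything is shifted out of the frame
    if s > 0:
        out[:, s:] = img[:, :w - s]
    elif s < 0:
        out[:, :w + s] = img[:, -s:]
    else:
        out[:] = img
    return out


def to_matcher_convention(secondary, d_min, d_max):
    """Shift the secondary image so every disparity becomes non-negative.

    Assumed input convention: match at x + d, with d in [d_min, d_max].
    After the shift, the match lies at x - d' with d' in [0, max_disp].
    """
    s = -int(np.ceil(d_max))
    max_disp = int(np.ceil(d_max)) - int(np.floor(d_min))
    return shift_columns(secondary, s), s, max_disp


def from_matcher_convention(d_pred, s):
    """Map a predicted disparity d' back to the original convention."""
    return -(d_pred + s)
```

For example, with a range of [2, 5] the secondary image is shifted 5 columns to the left, the matcher only has to search d' in [0, 3], and a prediction d' = 2 maps back to d = 3. Sub-pixel shifts and the border columns lost to zero padding are ignored in this sketch.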

