CV · AI · RO · Mar 2, 2025

Bridging Spectral-wise and Multi-spectral Depth Estimation via Geometry-guided Contrastive Learning

arXiv:2503.00793v1 · 4 citations · h-index: 8 · Has Code · ICRA
Originality: Incremental advance
AI Analysis

This paper addresses robust depth estimation for autonomous vehicles that use multi-modal sensors. The contribution appears incremental, as it builds on existing multi-spectral and fusion methods.

The paper tackles depth estimation from multi-spectral images by proposing an align-and-fuse strategy with geometry-guided contrastive learning, enabling a single network to achieve spectral-invariant and fused depth estimation with improved reliability, memory efficiency, and flexibility.

Deploying depth estimation networks in the real world requires a high level of robustness against various adverse conditions to ensure safe and reliable autonomy. For this purpose, many autonomous vehicles employ multi-modal sensor systems, including an RGB camera, NIR camera, thermal camera, LiDAR, or radar. They mainly adopt two strategies to use multiple sensors: modality-wise inference and multi-modal fused inference. The former is flexible but memory-inefficient, unreliable, and vulnerable. Multi-modal fusion can provide high reliability, yet it needs a specialized architecture. In this paper, we propose an effective solution, named the align-and-fuse strategy, for depth estimation from multi-spectral images. In the align stage, we align the embedding spaces of multiple spectrum bands to learn a shareable representation across multi-spectral images by minimizing a contrastive loss over global features and spatially aligned local features, using a geometry cue. Then, in the fuse stage, we train an attachable feature fusion module that selectively aggregates the multi-spectral features for reliable and robust predictions. With the proposed method, a single depth network can achieve both spectral-invariant and multi-spectral fused depth estimation while preserving reliability, memory efficiency, and flexibility.
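
The align stage described in the abstract can be illustrated with a generic contrastive objective. The sketch below is not the authors' implementation: it assumes a standard symmetric InfoNCE loss over one global feature vector per image, where row i of each band comes from the same scene. The function names, the temperature value, and the toy feature vectors are hypothetical, and the geometry-guided alignment of local features is not shown because the abstract does not specify it.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def info_nce(anchors, positives, temperature=0.1):
    """Mean cross-entropy where anchors[i] should match positives[i].

    All other rows of `positives` act as negatives for anchors[i].
    """
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity of anchor i to every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(a)


def symmetric_alignment_loss(feat_a, feat_b, temperature=0.1):
    """Average the loss in both directions (band A to B and band B to A)."""
    return 0.5 * (info_nce(feat_a, feat_b, temperature)
                  + info_nce(feat_b, feat_a, temperature))


# Toy global features for two scenes (hypothetical values).
rgb = [[1.0, 0.0], [0.0, 1.0]]
thermal_aligned = [[0.9, 0.1], [0.1, 0.9]]   # same scene order as rgb
thermal_shuffled = [[0.1, 0.9], [0.9, 0.1]]  # scene order swapped

loss_aligned = symmetric_alignment_loss(rgb, thermal_aligned)
loss_shuffled = symmetric_alignment_loss(rgb, thermal_shuffled)
```

Under these assumptions, features from matching scenes give a lower loss than mismatched ones, which is the pressure that pulls the embedding spaces of different spectrum bands together.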

Code Implementations: 1 repo