CVMar 31

EarthBridge: A Solution for 4th Multi-modal Aerial View Image Challenge Translation Track

arXiv:2603.0675342.9h-index: 4Has Code
AI Analysis

This work addresses the problem of multi-modal aerial image analysis for remote sensing applications, but it is incremental as it builds on existing translation methods for a specific challenge.

The paper tackles cross-modal image-to-image translation among aerial sensors (EO, IR, SAR) by proposing EarthBridge, a framework using DBIM and CUT methods, achieving a composite score of 0.38 and second place in the MAVIC-T challenge.

Cross-modal image-to-image translation among Electro-Optical (EO), Infrared (IR), and Synthetic Aperture Radar (SAR) sensors is essential for comprehensive multi-modal aerial-view analysis. However, translating between these modalities is notoriously difficult due to their distinct electromagnetic signatures and geometric characteristics. This paper presents \textbf{EarthBridge}, a high-fidelity translation framework developed for the 4th Multi-modal Aerial View Image Challenge -- Translation (MAVIC-T). We explore two distinct methodologies: \textbf{Diffusion Bridge Implicit Models (DBIM)}, which we generalize using non-Markovian bridge processes for high-quality deterministic sampling, and \textbf{Contrastive Unpaired Translation (CUT)}, which utilizes contrastive learning for structural consistency. Our EarthBridge framework employs a channel-concatenated UNet denoiser trained with Karras-weighted bridge scalings and a specialized "booting noise" initialization to handle the inherent ambiguity in cross-modal mappings. We evaluate these methods across all four challenge tasks (SAR$\rightarrow$EO, SAR$\rightarrow$RGB, SAR$\rightarrow$IR, RGB$\rightarrow$IR), achieving superior spatial detail and spectral accuracy. Our solution achieved a composite score of 0.38, securing the second position on the MAVIC-T leaderboard. Code is available at https://github.com/Bili-Sakura/EarthBridge-Preview.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes