CV | AI | LG | Jun 19, 2025

Heterogeneous-Modal Unsupervised Domain Adaptation via Latent Space Bridging

arXiv:2506.15971v1 | 2 citations | h-index: 3
Originality: Highly original
AI Analysis

This work addresses a critical limitation of domain adaptation: scenarios in which the source and target modalities are completely different, such as transferring knowledge between images and text. It is incremental in that it builds on existing UDA methods.

The paper tackles unsupervised domain adaptation when the source and target domains come from entirely distinct modalities. It proposes a new setting, Heterogeneous-Modal Unsupervised Domain Adaptation (HMUDA), and a framework for semantic segmentation, Latent Space Bridging (LSB), which is reported to achieve state-of-the-art performance on six benchmark datasets.

Unsupervised domain adaptation (UDA) methods effectively bridge domain gaps but struggle when the source and target domains belong to entirely distinct modalities. To address this limitation, we propose a novel setting called Heterogeneous-Modal Unsupervised Domain Adaptation (HMUDA), which enables knowledge transfer between completely different modalities by leveraging a bridge domain containing unlabeled samples from both modalities. To learn under the HMUDA setting, we propose Latent Space Bridging (LSB), a specialized framework designed for the semantic segmentation task. Specifically, LSB uses a dual-branch architecture that incorporates a feature consistency loss to align representations across modalities and a domain alignment loss to reduce discrepancies between class centroids across domains. Extensive experiments conducted on six benchmark datasets demonstrate that LSB achieves state-of-the-art performance.
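
The two loss terms named in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract gives no formulas, so the squared Euclidean distance, the assumption that bridge-domain samples come as modality pairs, the use of (pseudo-)labels for target centroids, and all function names below are hypothetical choices made for illustration only.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def squared_distance(u: Vector, v: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum((a - b) ** 2 for a, b in zip(u, v))


def feature_consistency_loss(feats_a: List[Vector], feats_b: List[Vector]) -> float:
    """Mean squared distance between paired bridge-domain features.

    Assumption: feats_a[i] and feats_b[i] are the two branches' features
    for the same bridge sample, seen in modality A and modality B.
    """
    if not feats_a or len(feats_a) != len(feats_b):
        raise ValueError("need equally sized, non-empty feature lists")
    total = sum(squared_distance(a, b) for a, b in zip(feats_a, feats_b))
    return total / len(feats_a)


def class_centroids(feats: List[Vector], labels: List[int]) -> Dict[int, List[float]]:
    """Average the feature vectors that share a class label.

    Assumption: target-domain labels would be pseudo-labels, since the
    target domain is unlabeled in the UDA setting.
    """
    sums: Dict[int, List[float]] = {}
    counts: Dict[int, int] = {}
    for f, y in zip(feats, labels):
        if y not in sums:
            sums[y] = [0.0] * len(f)
            counts[y] = 0
        sums[y] = [s + x for s, x in zip(sums[y], f)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def domain_alignment_loss(src: Dict[int, List[float]], tgt: Dict[int, List[float]]) -> float:
    """Mean squared distance between centroids of classes found in both domains."""
    shared = sorted(set(src) & set(tgt))
    if not shared:
        return 0.0  # nothing to align when no class appears in both domains
    return sum(squared_distance(src[c], tgt[c]) for c in shared) / len(shared)
```

In a training loop these two terms would typically be added, with weighting coefficients, to the supervised segmentation loss on the labeled source domain; the weights and the exact combination are not given in the abstract.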

