CV · Sep 29, 2025

DAM: Dual Active Learning with Multimodal Foundation Model for Source-Free Domain Adaptation

arXiv:2509.24896v1 · h-index: 5
Originality: Incremental advance
AI Analysis

This work addresses domain adaptation when the source data is unavailable, which could make adaptation more efficient in applications such as autonomous driving or medical imaging. It is an incremental advance, since it builds on existing active learning and multimodal methods.

The paper tackles source-free active domain adaptation by proposing DAM, a framework that integrates multimodal supervision from a Vision-and-Language model with sparse human annotations to enhance knowledge transfer, achieving state-of-the-art performance across multiple benchmarks.

Source-free active domain adaptation (SFADA) enhances knowledge transfer from a source model to an unlabeled target domain using limited manual labels selected via active learning. While recent domain adaptation studies have introduced Vision-and-Language (ViL) models to improve pseudo-label quality or feature alignment, they often treat ViL-based and data supervision as separate sources, lacking effective fusion. To overcome this limitation, we propose Dual Active learning with Multimodal (DAM) foundation model, a novel framework that integrates multimodal supervision from a ViL model to complement sparse human annotations, thereby forming a dual supervisory signal. DAM initializes stable ViL-guided targets and employs a bidirectional distillation mechanism to foster mutual knowledge exchange between the target model and the dual supervisions during iterative adaptation. Extensive experiments demonstrate that DAM consistently outperforms existing methods and sets a new state-of-the-art across multiple SFADA benchmarks and active learning strategies.
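The idea of a dual supervisory signal can be illustrated with a small sketch. This is not the authors' implementation: the function names, the entropy-based selection rule, the loss weighting, and the use of NumPy are all assumptions made for illustration. The abstract does not specify the active learning criterion or the loss, and the bidirectional distillation step is not modelled here. The sketch only shows one plausible way to pick samples for annotation and to combine sparse human labels with soft targets from a ViL model.

```python
import numpy as np

def softmax(logits, axis=-1):
    # Numerically stable softmax.
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def select_by_entropy(probs, budget):
    # Hypothetical active learning rule: request labels for the
    # `budget` samples whose predictions have the highest entropy.
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    return np.argsort(-entropy)[:budget]

def dual_supervision_loss(logits, vil_probs, labels, labeled_mask, weight=1.0):
    # Assumed form of a dual signal:
    #   cross-entropy on the few human-labeled samples
    #   + weight * KL(ViL soft targets || model predictions) on all samples.
    log_probs = np.log(softmax(logits) + 1e-12)
    kl = (vil_probs * (np.log(vil_probs + 1e-12) - log_probs)).sum(axis=1).mean()
    if labeled_mask.any():
        ce = -log_probs[labeled_mask, labels[labeled_mask]].mean()
    else:
        ce = 0.0
    return ce + weight * kl

# Toy usage: three target samples, three classes.
probs = np.array([[0.98, 0.01, 0.01],
                  [1 / 3, 1 / 3, 1 / 3],
                  [0.70, 0.20, 0.10]])
query = select_by_entropy(probs, budget=1)   # picks the uniform (most uncertain) row
mask = np.zeros(len(probs), dtype=bool)
mask[query] = True                           # pretend an annotator labeled it
labels = np.array([0, 2, 0])                 # made-up labels for the toy example
loss = dual_supervision_loss(np.log(probs), probs, labels, mask)
```

In this toy setup the ViL targets equal the model's own predictions, so the distillation term vanishes and the loss reduces to the cross-entropy on the single queried sample.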

