CV | IV | Sep 30, 2023

Domain-Controlled Prompt Learning

arXiv:2310.07730v2 | 38 citations | h-index: 9 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses domain adaptation for vision-language models. It is an incremental advance: it builds on existing prompt learning methods by adding domain-awareness mechanisms.

The paper adapts large pre-trained vision-language models to specific domains such as remote sensing and medical images. It introduces Domain-Controlled Prompt Learning, which combines a domain foundation model with a noise-adding strategy (the authors' "noisy-adding strategy") and reports state-of-the-art performance on specific-domain image recognition datasets.

Large pre-trained vision-language models, such as CLIP, have shown remarkable generalization capabilities across various tasks when appropriate text prompts are provided. However, adapting these models to specific domains, like remote sensing images (RSIs), medical images, etc., remains unexplored and challenging. Existing prompt learning methods often lack domain-awareness or domain-transfer mechanisms, leading to suboptimal performance because domain-specific images are misinterpreted in terms of natural image patterns. To tackle this dilemma, we propose Domain-Controlled Prompt Learning for specific domains. Specifically, the large-scale specific domain foundation model (LSDM) is first introduced to provide essential specific domain knowledge. Using lightweight neural networks, we transfer this knowledge into domain biases, which control both the visual and language branches to obtain domain-adaptive prompts by direct incorporation. Simultaneously, to overcome the existing overfitting challenge, we propose a novel noisy-adding strategy, without extra trainable parameters, to help the model escape the suboptimal solution in a global domain oscillation manner. Experimental results show our method achieves state-of-the-art performance on specific domain image recognition datasets. Our code is available at https://github.com/caoql98/DCPL.
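The two mechanisms named in the abstract can be illustrated with a small sketch. This is not the authors' implementation (see the linked repository for that). It assumes the LSDM output is available as a plain feature vector, that the "lightweight neural network" is a single linear layer, that the domain bias is simply added to every prompt token, and that the noisy-adding strategy is zero-mean Gaussian noise with a fixed scale. All function and variable names are hypothetical.

```python
import random


def linear(x, weight, bias):
    """Single linear layer; weight is a list of rows (out_dim x in_dim)."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def domain_controlled_prompts(prompts, domain_feature, weight, bias):
    """Map the domain feature to a bias vector and add it to every prompt token.

    Assumption: "direct incorporation" is modelled here as element-wise addition.
    The same function could be applied to visual and to language prompts.
    """
    domain_bias = linear(domain_feature, weight, bias)
    return [[p + d for p, d in zip(token, domain_bias)] for token in prompts]


def add_noise(prompts, sigma, rng):
    """Perturb prompts with Gaussian noise; introduces no trainable parameters.

    Assumption: the noise distribution and the fixed sigma are illustrative.
    """
    return [[p + rng.gauss(0.0, sigma) for p in token] for token in prompts]


# Toy example: 2 prompt tokens of dimension 3, domain feature of dimension 2.
rng = random.Random(0)
prompts = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
domain_feature = [1.0, 2.0]          # stand-in for an LSDM feature
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
bias = [0.0, 0.0, 0.5]

adapted = domain_controlled_prompts(prompts, domain_feature, weight, bias)
noisy = add_noise(adapted, sigma=0.01, rng=rng)
```

In a training loop the noise would typically be applied only during training and dropped at inference; that choice is also an assumption here, not something the abstract states.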

Code Implementations: 1 repo
