CV | LG | Apr 29, 2024

Transitive Vision-Language Prompt Learning for Domain Generalization

arXiv:2404.18758v1 | 6 citations | h-index: 14 | IEEE Trans Emerg Top Comput Intell
Originality: Incremental advance
AI Analysis

This paper addresses domain generalization for vision-language models, but it appears incremental because it builds on existing prompt learning methods.

The paper tackles the trade-off between domain invariance and class separability in domain generalization by introducing a novel prompt learning strategy with deep vision and language prompts, achieving state-of-the-art performance on three datasets.

Vision-language pre-training has enabled deep models to take a large step forward in generalizing to unseen domains. Recent learning methods built on vision-language pre-trained models are effective tools for domain generalization (DG) and address the problem to a large extent. However, these advances still suffer from a trade-off between domain invariance and class separability, both of which are crucial in current DG problems. In this paper, we introduce a novel prompt learning strategy that leverages deep vision prompts to address domain invariance while utilizing language prompts to ensure class separability, coupled with adaptive weighting mechanisms to balance the two. Extensive experiments demonstrate that deep vision prompts effectively extract domain-invariant features, significantly improving the generalization ability of deep models and achieving state-of-the-art performance on three datasets.
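
As a rough illustration of what balancing the two objectives with adaptive weights could look like, the sketch below combines a domain-invariance loss and a class-separability loss using softmax-normalized weights. The abstract does not say how the weighting works, so the softmax-over-logits scheme, the function names (`softmax`, `balanced_loss`), and the loss values are all assumptions made for illustration, not the authors' method.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def balanced_loss(invariance_loss, separability_loss, weight_logits):
    """Combine two loss terms with weights that sum to 1.

    weight_logits: two floats standing in for learnable parameters
    (hypothetical; in training they would be updated by the optimizer).
    """
    w_inv, w_sep = softmax(weight_logits)
    total = w_inv * invariance_loss + w_sep * separability_loss
    return total, (w_inv, w_sep)


# Equal logits give equal weights, so the result is the plain average.
total, weights = balanced_loss(0.8, 0.4, [0.0, 0.0])

# A larger first logit shifts weight toward the invariance term.
shifted_total, shifted_weights = balanced_loss(0.8, 0.4, [1.0, 0.0])
```

Normalizing the weights keeps the combined loss on the same scale as its two terms, so moving emphasis toward one objective necessarily takes it away from the other.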

