CV
Jul 21, 2024

Rethinking Domain Adaptation and Generalization in the Era of CLIP

arXiv:2407.15173v1
8 citations
h-index: 11
Originality: Incremental advance
AI Analysis

This work addresses domain adaptation and generalization for researchers and practitioners who use CLIP. Its contribution appears incremental because it builds on existing CLIP capabilities.

The paper shows that a simple domain prior improves CLIP's zero-shot recognition in a specific domain, and that adaptation relies less on source-domain data because CLIP's pre-training data is diverse. It also creates a benchmark for zero-shot adaptation and pseudo-labeling-based self-training, and proposes improving CLIP's task generalization using multiple unlabeled domains.

Recent studies on domain adaptation have placed significant emphasis on transferring shared knowledge from a source domain to a target domain. Recently, the large vision-language pre-trained model CLIP has shown strong zero-shot recognition ability, and parameter-efficient tuning can further improve its performance on specific tasks. This work demonstrates that a simple domain prior boosts CLIP's zero-shot recognition in a specific domain. In addition, CLIP's adaptation relies less on source-domain data because of its diverse pre-training dataset. Furthermore, we create a benchmark for zero-shot adaptation and pseudo-labeling-based self-training with CLIP. Finally, we propose to improve the task generalization ability of CLIP using multiple unlabeled domains, which is a more practical and distinctive scenario. We believe our findings motivate a rethinking of domain adaptation benchmarks and of the role of related algorithms in the era of CLIP.
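The abstract does not say how the domain prior or the pseudo-labeling step is implemented, so the following is only a minimal sketch of one plausible reading. It assumes the prior is injected as a domain word in the text prompt (for example "a sketch of a dog" instead of "a photo of a dog") and that pseudo-labels are kept only when the top class probability exceeds a threshold. The function names, the prompt templates, the temperature of 0.01, and the threshold of 0.9 are all hypothetical choices for illustration, and plain Python lists stand in for CLIP image and text embeddings.

```python
import math


def build_prompts(class_names, domain=None):
    """Build one text prompt per class.

    Assumption: the domain prior is a single word placed in the template.
    Without a domain, fall back to a generic photo template.
    """
    if domain is None:
        return [f"a photo of a {name}" for name in class_names]
    return [f"a {domain} of a {name}" for name in class_names]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def softmax(scores, temperature=0.01):
    """Numerically stable softmax over temperature-scaled scores."""
    peak = max(scores)
    exps = [math.exp((s - peak) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def zero_shot_probs(image_emb, text_embs, temperature=0.01):
    """Class probabilities from image-to-prompt cosine similarities."""
    sims = [cosine(image_emb, t) for t in text_embs]
    return softmax(sims, temperature)


def select_pseudo_labels(prob_rows, threshold=0.9):
    """Return (sample_index, label) pairs whose top probability meets the threshold."""
    selected = []
    for index, probs in enumerate(prob_rows):
        label = max(range(len(probs)), key=probs.__getitem__)
        if probs[label] >= threshold:
            selected.append((index, label))
    return selected
```

In a real pipeline the prompts from `build_prompts` would be encoded by CLIP's text encoder and the images by its image encoder. The samples returned by `select_pseudo_labels` would then serve as training targets for self-training on the unlabeled target domain.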

