CV Jul 21, 2024

Rethinking Domain Adaptation and Generalization in the Era of CLIP

Ruoyu Feng, Tao Yu, Xin Jin, Xiaoyuan Yu, Lei Xiao, Zhibo Chen

arXiv:2407.15173v1

8.78 citations

h-index: 53

Originality: Incremental advance

AI Analysis

This work addresses domain adaptation and generalization challenges for researchers and practitioners using CLIP, though it appears incremental because it builds on existing CLIP capabilities.

The paper shows that a simple domain prior improves CLIP's zero-shot recognition in a specific domain, and that adaptation relies less on source-domain data because of CLIP's diverse pre-training dataset. It also creates a benchmark for zero-shot adaptation and pseudo-labeling based self-training, and proposes improving CLIP's task generalization from multiple unlabeled domains.

In recent studies on domain adaptation, significant emphasis has been placed on learning shared knowledge that transfers from a source domain to a target domain. Recently, the large vision-language pre-trained model CLIP has shown strong zero-shot recognition ability, and parameter-efficient tuning can further improve its performance on specific tasks. This work demonstrates that a simple domain prior boosts CLIP's zero-shot recognition in a specific domain. Besides, CLIP's adaptation relies less on source domain data due to its diverse pre-training dataset. Furthermore, we create a benchmark for zero-shot adaptation and pseudo-labeling based self-training with CLIP. Last but not least, we propose to improve the task generalization ability of CLIP from multiple unlabeled domains, which is a more practical and unique scenario. We believe our findings motivate a rethinking of domain adaptation benchmarks and the associated role of related algorithms in the era of CLIP.
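One way to picture the "simple domain prior" is as a domain word placed in the text prompt used for zero-shot classification. The sketch below is illustrative only: the prompt templates, the function names (`build_prompts`, `zero_shot_predict`), and the toy embedding vectors are assumptions, not the authors' implementation, and a real pipeline would obtain the embeddings from CLIP's image and text encoders.

```python
import math


def build_prompts(class_names, domain=None):
    """Return one text prompt per class.

    `domain` is a hypothetical domain prior such as "sketch" or "painting".
    Both templates are assumptions made for illustration.
    """
    if domain is None:
        return [f"a photo of a {name}" for name in class_names]
    return [f"a {domain} of a {name}" for name in class_names]


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def zero_shot_predict(image_embedding, text_embeddings):
    """Return the index of the prompt embedding most similar to the image."""
    scores = [cosine(image_embedding, t) for t in text_embeddings]
    return max(range(len(scores)), key=scores.__getitem__)


# Prompts with and without the domain prior.
generic_prompts = build_prompts(["dog", "cat"])
sketch_prompts = build_prompts(["dog", "cat"], domain="sketch")

# Toy vectors standing in for encoder outputs (made up for this example).
predicted_index = zero_shot_predict([1.0, 0.0], [[0.0, 1.0], [0.9, 0.1]])
```

In this sketch only the prompt text changes between the generic and domain-aware variants; the classification step (nearest text embedding by cosine similarity) stays the same.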
