LG CV · Apr 1, 2022

Connect, Not Collapse: Explaining Contrastive Learning for Unsupervised Domain Adaptation

Kendrick Shen, Robbie Jones, Ananya Kumar, Sang Michael Xie, Jeff Z. HaoChen, Tengyu Ma, Percy Liang

Stanford

arXiv:2204.00570v428.2104 citations · h-index: 102

Originality: Incremental advance

AI Analysis

This work shows that domain invariance is not necessary for generalization in unsupervised domain adaptation, a novel perspective for machine learning practitioners. It is rated incremental because it builds on existing contrastive learning methods.

The paper tackles unsupervised domain adaptation (UDA) by showing that contrastive pre-training, which learns features on unlabeled source and target data and then fine-tunes on labeled source data, is competitive with strong UDA methods, achieving results like 85.2% accuracy on Office-Home. It does so without relying on domain-invariant features, challenging conventional UDA intuitions.

We consider unsupervised domain adaptation (UDA), where labeled data from a source domain (e.g., photographs) and unlabeled data from a target domain (e.g., sketches) are used to learn a classifier for the target domain. Conventional UDA methods (e.g., domain adversarial training) learn domain-invariant features to improve generalization to the target domain. In this paper, we show that contrastive pre-training, which learns features on unlabeled source and target data and then fine-tunes on labeled source data, is competitive with strong UDA methods. However, we find that contrastive pre-training does not learn domain-invariant features, diverging from conventional UDA intuitions. We show theoretically that contrastive pre-training can learn features that vary substantially across domains but still generalize to the target domain, by disentangling domain and class information. Our results suggest that domain invariance is not necessary for UDA. We empirically validate our theory on benchmark vision datasets.
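
The abstract does not say which contrastive objective is used, so the following is only a minimal sketch of the pre-training step, assuming a generic SimCLR-style (NT-Xent) loss. The function name `info_nce_loss`, the `temperature` value, and the random arrays standing in for encoder outputs are all hypothetical. In the pipeline described above, the two inputs would be embeddings of two augmented views of images pooled from the unlabeled source and target data, with no domain or class labels involved.

```python
import numpy as np


def info_nce_loss(z_a, z_b, temperature=0.5):
    """SimCLR-style contrastive loss (an assumed objective, not the paper's stated one).

    z_a[i] and z_b[i] are embeddings of two augmented views of the same
    unlabeled image; all other rows in the batch act as negatives.
    """
    n = z_a.shape[0]
    z = np.concatenate([z_a, z_b], axis=0)            # shape (2n, d)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # unit-normalize each row
    sim = z @ z.T / temperature                       # scaled cosine similarities
    np.fill_diagonal(sim, -np.inf)                    # a row is never its own positive
    # Row i in the first half is paired with row i + n, and vice versa.
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    # Numerically stable log-softmax over each row.
    row_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - row_max - np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), pos].mean()


# Stand-in embeddings (hypothetical): in practice these come from an encoder
# applied to augmented images drawn from both domains.
rng = np.random.default_rng(0)
views_a = rng.normal(size=(8, 16))
views_b_aligned = views_a + 0.01 * rng.normal(size=(8, 16))  # views that agree
views_b_random = rng.normal(size=(8, 16))                    # unrelated views

loss_aligned = info_nce_loss(views_a, views_b_aligned)
loss_random = info_nce_loss(views_a, views_b_random)
print(loss_aligned, loss_random)
```

The loss is lower when paired views map to nearby embeddings than when they are unrelated. Nothing in it pushes source and target embeddings toward each other, which is consistent with the abstract's finding that the learned features need not be domain-invariant. Fine-tuning on labeled source data would follow as a separate supervised step.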
