CV | LG | Sep 7, 2023

Adapting Self-Supervised Representations to Multi-Domain Setups

arXiv:2309.03999v2 | h-index: 49
Originality: Incremental advance
AI Analysis

This paper addresses a problem relevant to AI practitioners: deploying self-supervised models in diverse real-world setups. The contribution is incremental, since it builds on existing self-supervised methods.

The paper tackles the limited generalization of self-supervised models to unseen domains, which persists even when they are trained on a mixture of domains. It proposes a Domain Disentanglement Module (DDM) that splits representations into domain-variant and domain-invariant portions. The reported results show up to a 3.5% improvement in linear probing accuracy on multi-domain benchmarks and 7.4% better generalization to unseen domains compared to baselines.

Current state-of-the-art self-supervised approaches are effective when trained on individual domains but show limited generalization on unseen domains. We observe that these models generalize poorly even when trained on a mixture of domains, making them unsuitable for deployment under diverse real-world setups. We therefore propose a general-purpose, lightweight Domain Disentanglement Module (DDM) that can be plugged into any self-supervised encoder to effectively perform representation learning on multiple, diverse domains with or without shared classes. During pre-training with a self-supervised loss, DDM enforces a disentanglement in the representation space by splitting it into a domain-variant and a domain-invariant portion. When domain labels are not available, DDM uses a robust clustering approach to discover pseudo-domains. We show that pre-training with DDM yields up to a 3.5% improvement in linear probing accuracy for state-of-the-art self-supervised models, including SimCLR, MoCo, BYOL, DINO, SimSiam and Barlow Twins, on multi-domain benchmarks including PACS, DomainNet and WILDS. Models trained with DDM show significantly improved generalization (7.4%) to unseen domains compared to baselines. Therefore, DDM can efficiently adapt self-supervised encoders to provide high-quality, generalizable representations for diverse multi-domain data.
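
The two mechanisms named in the abstract, splitting the representation and discovering pseudo-domains by clustering, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function names `split_representation` and `kmeans_pseudo_domains`, the `invariant_dim` parameter, the choice of splitting by feature index, and the use of plain k-means with farthest-first initialisation (standing in for the unspecified "robust clustering approach") are all assumptions made for illustration. The disentanglement losses applied during pre-training are not reproduced here.

```python
import numpy as np

def split_representation(z, invariant_dim):
    """Split encoder outputs of shape (n, d) into two portions by feature index.

    Assumption: the first `invariant_dim` features are treated as the
    domain-invariant portion and the remainder as the domain-variant portion.
    """
    if not 0 < invariant_dim < z.shape[1]:
        raise ValueError("invariant_dim must lie strictly between 0 and d")
    return z[:, :invariant_dim], z[:, invariant_dim:]

def kmeans_pseudo_domains(x, k, n_iter=20):
    """Assign a pseudo-domain label to each row of x using plain k-means.

    Hypothetical stand-in for the paper's robust clustering step.
    """
    # Farthest-first initialisation: start at x[0], then repeatedly add the
    # point that is farthest from all centers chosen so far.
    centers = [x[0]]
    for _ in range(1, k):
        d = np.min([((x - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(x[d.argmax()])
    centers = np.stack(centers).astype(float)

    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center.
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return labels

# Toy data: 64 embeddings of width 16 from two synthetic "domains".
# The second domain is shifted only in the last 8 features.
rng = np.random.default_rng(0)
z = rng.normal(size=(64, 16))
z[32:, 8:] += 10.0

z_inv, z_var = split_representation(z, invariant_dim=8)
pseudo_domains = kmeans_pseudo_domains(z_var, k=2)
```

In this toy setup the domain shift lives entirely in the domain-variant half, so clustering that half recovers the two synthetic domains without any domain labels. In an actual training run the split would only become meaningful through the losses that enforce it.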

