LG, CV · Jan 31, 2023

Domain-Generalizable Multiple-Domain Clustering

Meta AI
arXiv:2301.13530v2 · 11 citations · h-index: 63 · Has Code
AI Analysis

This work addresses domain generalization in clustering when only unlabeled data are available. The contribution is incremental: it extends unsupervised methods to multiple domains.

The paper tackles unsupervised domain generalization for clustering, where no labeled samples are available, by learning a shared predictor from multiple source domains that assigns examples to semantically related clusters in unseen domains. Empirically, the model is more accurate than baselines that require fine-tuning on target-domain samples or some level of supervision.

This work generalizes the problem of unsupervised domain generalization to the case in which no labeled samples are available (completely unsupervised). We are given unlabeled samples from multiple source domains, and we aim to learn a shared predictor that assigns examples to semantically related clusters. Evaluation is done by predicting cluster assignments in previously unseen domains. Towards this goal, we propose a two-stage training framework: (1) self-supervised pre-training for extracting domain-invariant semantic features, and (2) multi-head cluster prediction with pseudo labels that rely on both the feature space and the cluster head predictions, further leveraging a novel prediction-based label smoothing scheme. We demonstrate empirically that our model is more accurate than baselines that require fine-tuning using samples from the target domain or some level of supervision. Our code is available at https://github.com/AmitRozner/domain-generalizable-multiple-domain-clustering.
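
The abstract does not spell out how pseudo labels are built or smoothed, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes (hypothetically) that a pseudo label is kept when the nearest feature-space centroid and the cluster head's top prediction agree, that centroids are indexed consistently with the head outputs, and that the training target mixes the one-hot pseudo label with the head's own softmax prediction using a weight `alpha`. All function names, the agreement rule, and `alpha` are illustrative assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def nearest_centroid(feature, centroids):
    """Index of the centroid closest to `feature` (squared Euclidean distance)."""
    dists = [
        sum((f - c) ** 2 for f, c in zip(feature, centroid))
        for centroid in centroids
    ]
    return dists.index(min(dists))


def pseudo_label(feature, centroids, head_probs):
    """Hypothetical agreement rule: accept a label only when the feature-space
    assignment and the cluster head's argmax coincide; otherwise return None."""
    k_feat = nearest_centroid(feature, centroids)
    k_head = head_probs.index(max(head_probs))
    return k_feat if k_feat == k_head else None


def smoothed_target(label, head_probs, alpha=0.1):
    """Assumed smoothing: (1 - alpha) * one_hot(label) + alpha * head_probs.
    Unlike uniform label smoothing, the smoothing mass follows the prediction."""
    return [
        (1.0 - alpha) * (1.0 if k == label else 0.0) + alpha * p
        for k, p in enumerate(head_probs)
    ]


def cross_entropy(target, probs, eps=1e-12):
    """Cross-entropy between a soft target and predicted probabilities."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, probs))


if __name__ == "__main__":
    centroids = [[0.0, 0.0], [1.0, 1.0]]   # toy cluster centroids
    feature = [0.9, 1.1]                   # toy embedding of one sample
    probs = softmax([0.2, 2.0])            # toy output of one cluster head
    label = pseudo_label(feature, centroids, probs)
    if label is not None:
        target = smoothed_target(label, probs, alpha=0.1)
        print(label, target, cross_entropy(target, probs))
```

In a multi-head setup, a loss of this kind would presumably be computed per head and averaged. How the paper actually combines heads and selects pseudo labels is documented in the linked repository rather than in the abstract.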

Code Implementations: 1 repo