LGNov 6, 2023

The Pursuit of Human Labeling: A New Perspective on Unsupervised Learning

arXiv:2311.02940v19 citationsh-index: 12
Originality Highly original
AI Analysis

This work addresses the problem of unsupervised learning for AI researchers by introducing a novel paradigm that could reduce reliance on labeled data, though it is incremental in applying linear separability to existing representations.

The paper tackles unsupervised learning by proposing HUME, a model-agnostic framework that infers human labeling without supervision, based on the insight that human-defined classes are linearly separable across representation spaces. It outperforms supervised linear classifiers on STL-10 and achieves state-of-the-art results on benchmarks including ImageNet-1000.

We present HUME, a simple model-agnostic framework for inferring human labeling of a given dataset without any external supervision. The key insight behind our approach is that classes defined by many human labelings are linearly separable regardless of the representation space used to represent a dataset. HUME utilizes this insight to guide the search over all possible labelings of a dataset to discover an underlying human labeling. We show that the proposed optimization objective is strikingly well-correlated with the ground truth labeling of the dataset. In effect, we only train linear classifiers on top of pretrained representations that remain fixed during training, making our framework compatible with any large pretrained and self-supervised model. Despite its simplicity, HUME outperforms a supervised linear classifier on top of self-supervised representations on the STL-10 dataset by a large margin and achieves comparable performance on the CIFAR-10 dataset. Compared to the existing unsupervised baselines, HUME achieves state-of-the-art performance on four benchmark image classification datasets including the large-scale ImageNet-1000 dataset. Altogether, our work provides a fundamentally new view to tackle unsupervised learning by searching for consistent labelings between different representation spaces.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes