CV | AI | Mar 31, 2023

Exploring the Limits of Deep Image Clustering using Pretrained Models

arXiv:2303.17896v2 | 42 citations | h-index: 6 | Has Code
Originality: Incremental advance
AI Analysis

This work targets unsupervised image classification and reports incremental improvements over k-means clustering of pretrained features.

The paper tackles deep image clustering without labels by training clustering heads on top of pretrained feature extractors with a novel objective based on pointwise mutual information. Across 17 pretrained models, it improves clustering accuracy over k-means by 6.1% on ImageNet and 12.2% on CIFAR100, and it reaches 61.6% accuracy on ImageNet with self-supervised vision transformers.

We present a general methodology that learns to classify images without labels by leveraging pretrained feature extractors. Our approach involves self-distillation training of clustering heads based on the fact that nearest neighbours in the pretrained feature space are likely to share the same label. We propose a novel objective that learns associations between image features by introducing a variant of pointwise mutual information together with instance weighting. We demonstrate that the proposed objective is able to attenuate the effect of false positive pairs while efficiently exploiting the structure in the pretrained feature space. As a result, we improve the clustering accuracy over k-means on 17 different pretrained models by 6.1% and 12.2% on ImageNet and CIFAR100, respectively. Finally, using self-supervised vision transformers, we achieve a clustering accuracy of 61.6% on ImageNet. The code is available at https://github.com/HHU-MMBS/TEMI-official-BMVC2023.
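
The two ingredients named in the abstract, nearest-neighbour pairs mined in the pretrained feature space and a weighted pointwise-mutual-information objective, can be illustrated with a minimal NumPy sketch. This is not taken from the authors' repository: the function names (`knn_indices`, `weighted_pmi_loss`), the cosine-similarity choice, the exponent `beta`, and the exact form of the score and of the instance weight are assumptions made for illustration only.

```python
import numpy as np


def knn_indices(features, k):
    """Indices of the k nearest neighbours of each row (cosine similarity).

    Assumption: neighbours are mined with cosine similarity on frozen,
    pretrained features; each row is excluded from its own neighbour list.
    """
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)  # never return a point as its own neighbour
    return np.argsort(-sim, axis=1)[:, :k]


def weighted_pmi_loss(student_x, teacher_x, teacher_nn, cluster_prior,
                      beta=0.6, eps=1e-8):
    """Illustrative PMI-style loss with instance weighting (hypothetical form).

    student_x     : (B, C) student head probabilities for image x
    teacher_x     : (B, C) teacher head probabilities for image x
    teacher_nn    : (B, C) teacher head probabilities for a neighbour x'
    cluster_prior : (C,)   running estimate of the marginal cluster probability
    """
    # PMI-style score: per-cluster agreement between x and x', normalised by
    # how common each cluster is, so large clusters are not trivially rewarded.
    agreement = (student_x * teacher_nn) ** beta
    pmi = np.log((agreement / (cluster_prior + eps)).sum(axis=1) + eps)
    # Instance weight: pairs the teacher assigns to different clusters
    # (likely false positives) get a small weight.
    weight = (teacher_x * teacher_nn).sum(axis=1)
    return -(weight * pmi).mean()


# Toy usage: six 2-D "features" forming two groups, and a two-cluster head.
features = np.array([[1.0, 0.1], [1.0, 0.0], [0.9, 0.05],
                     [0.0, 1.0], [0.1, 1.0], [0.05, 0.9]])
neighbours = knn_indices(features, k=2)
print("neighbour indices:\n", neighbours)

prior = np.array([0.5, 0.5])
q_x = np.array([[0.9, 0.1]])
print("loss, agreeing pair:   ",
      weighted_pmi_loss(q_x, q_x, np.array([[0.9, 0.1]]), prior))
print("loss, disagreeing pair:",
      weighted_pmi_loss(q_x, q_x, np.array([[0.1, 0.9]]), prior))
```

In this sketch a pair on which the teacher disagrees contributes little to the loss because of its small weight, which is one way an objective can attenuate false positive pairs; the objective in the paper may differ in its details.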

Code Implementations: 1 repo
