CV | AI | Jul 24, 2021

Clustering by Maximizing Mutual Information Across Views

arXiv:2107.11635v1 | 54 citations
Originality: Incremental advance
AI Analysis

This paper addresses the problem of improving clustering accuracy on image data. The advance appears incremental because it builds on existing contrastive learning and clustering techniques.

The paper tackles image clustering with a framework that trains a representation learning head and a clustering head on a shared backbone. It reports an accuracy improvement of about 5-7% over the best single-stage clustering baseline on CIFAR10/20, STL10, and ImageNet-Dogs.

We propose a novel framework for image clustering that incorporates joint representation learning and clustering. Our method consists of two heads that share the same backbone network - a "representation learning" head and a "clustering" head. The "representation learning" head captures fine-grained patterns of objects at the instance level which serve as clues for the "clustering" head to extract coarse-grain information that separates objects into clusters. The whole model is trained in an end-to-end manner by minimizing the weighted sum of two sample-oriented contrastive losses applied to the outputs of the two heads. To ensure that the contrastive loss corresponding to the "clustering" head is optimal, we introduce a novel critic function called "log-of-dot-product". Extensive experimental results demonstrate that our method significantly outperforms state-of-the-art single-stage clustering methods across a variety of image datasets, improving over the best baseline by about 5-7% in accuracy on CIFAR10/20, STL10, and ImageNet-Dogs. Further, the "two-stage" variant of our method also achieves better results than baselines on three challenging ImageNet subsets.
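The objective described in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation. It assumes that both losses take the InfoNCE form, that the clustering head outputs softmax probabilities scored by the log of their dot product, and that the representation head is scored by temperature-scaled cosine similarity. The function names, the `weight` and `temperature` defaults, the `eps` term, and the choice of which loss carries the weight are all hypothetical.

```python
import math

def softmax(logits):
    # Numerically stable softmax over one list of logits.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def l2_normalize(v):
    norm = math.sqrt(dot(v, v)) or 1.0
    return [x / norm for x in v]

def log_dot_critic(p, q, eps=1e-8):
    # "Log-of-dot-product" critic on two cluster-probability vectors.
    # eps is an assumption added here to avoid log(0).
    return math.log(dot(p, q) + eps)

def cosine_critic(temperature):
    # Assumed critic for the representation head: scaled cosine similarity.
    def critic(u, v):
        return dot(l2_normalize(u), l2_normalize(v)) / temperature
    return critic

def info_nce(view_a, view_b, critic):
    # Mean contrastive loss; view_a[i] and view_b[i] form the positive pair,
    # and every other view_b[j] acts as a negative for view_a[i].
    n = len(view_a)
    loss = 0.0
    for i in range(n):
        scores = [critic(view_a[i], view_b[j]) for j in range(n)]
        m = max(scores)
        log_norm = m + math.log(sum(math.exp(s - m) for s in scores))
        loss += log_norm - scores[i]
    return loss / n

def joint_loss(cluster_logits_a, cluster_logits_b, feats_a, feats_b,
               weight=1.0, temperature=0.1):
    # Weighted sum of the clustering-head and representation-head losses.
    probs_a = [softmax(row) for row in cluster_logits_a]
    probs_b = [softmax(row) for row in cluster_logits_b]
    cluster_loss = info_nce(probs_a, probs_b, log_dot_critic)
    rep_loss = info_nce(feats_a, feats_b, cosine_critic(temperature))
    return cluster_loss + weight * rep_loss
```

With the log-of-dot-product critic, the exponentiated score is simply the dot product of the two cluster-probability vectors, so the clustering loss is small when two views of the same image receive the same confident cluster assignment and other images receive different ones.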

