CVLGMay 13

Reducing Bias and Variance: Generative Semantic Guidance and Bi-Layer Ensemble for Image Clustering

arXiv:2605.1296112.0
Predicted impact top 60% in CV · last 90 daysOriginality Incremental advance
AI Analysis

For researchers in unsupervised image clustering, this work addresses the overlooked variance issue and provides a practical framework that achieves strong empirical results.

GSEC improves image clustering by reducing both bias (via generative semantic guidance from MLLMs) and variance (via a bi-layer ensemble), outperforming 18 state-of-the-art methods on six benchmarks.

Image clustering aims to partition unlabeled image datasets into distinct groups. A core aspect of this task is constructing and leveraging prior knowledge to guide the clustering process. Recent approaches introduce semantic descriptions as prior information, most of which typically relying on matching-based techniques with predefined vocabularies. However, the limited matching space restricts their adaptability to downstream clustering tasks. Moreover, these methods primarily focus on reducing bias to improve performance, frequently overlooking the importance of variance reduction. To address these limitations, we propose GSEC (Image Clustering based on Generative Semantic Guidance and Bi-Layer Ensemble), a framework designed to reduce bias through generative semantic guidance and mitigate variance via ensemble learning. Our method employs Multimodal Large Language Models to generate semantic descriptions and derive image embeddings via weighted averaging. Additionally, a bi-layer ensemble strategy integrates cross-modal information through BatchEnsemble in the inner layer and aligns outputs via an alignment mechanism in the outer layer. Comparative experiments demonstrate that GSEC outperforms 18 state-of-the-art methods across six benchmark datasets, while further analysis confirms its effectiveness in simultaneously reducing both bias and variance. The code is available at https://github.com/2017LI/GSEC.git.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes