AI · CV · Jun 8, 2025

Long-Tailed Learning for Generalized Category Discovery

arXiv:2506.06965v1 · 1 citation · h-index: 2 · Neurocomputing
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of discovering novel classes in imbalanced datasets, which matters for real-world applications. The contribution appears incremental, since it builds on existing generalized category discovery methods.

The paper tackles generalized category discovery under long-tailed distributions, where the class imbalance typical of real-world data reduces the effectiveness of existing methods. It proposes a framework that combines self-guided labeling with representation balancing and reports results that exceed previous state-of-the-art methods.

Generalized Category Discovery (GCD) utilizes labeled samples of known classes to discover novel classes in unlabeled samples. Existing methods show effective performance on artificial datasets with balanced distributions. However, real-world datasets are always imbalanced, significantly affecting the effectiveness of these methods. To solve this problem, we propose a novel framework that performs generalized category discovery in long-tailed distributions. We first present a self-guided labeling technique that uses a learnable distribution to generate pseudo-labels, resulting in less biased classifiers. We then introduce a representation balancing process to derive discriminative representations. By mining sample neighborhoods, this process encourages the model to focus more on tail classes. We conduct experiments on public datasets to demonstrate the effectiveness of the proposed framework. The results show that our model exceeds previous state-of-the-art methods.
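The abstract does not spell out how the pseudo-labels are generated, so the following is only an illustrative sketch of one common way to make pseudo-labels follow a target class distribution, not the paper's method. It uses Sinkhorn-style alternating normalisation to rescale classifier probabilities until the per-class mass matches a given prior. The function name `sinkhorn_pseudo_labels`, the toy inputs, and the choice to pass the prior in as a fixed argument (the paper describes a learnable distribution) are all assumptions made for illustration.

```python
def sinkhorn_pseudo_labels(probs, prior, n_iters=100):
    """Hypothetical helper: rescale `probs` (n samples x k classes, all > 0)
    so that each row is a probability vector and the total mass assigned to
    class c is approximately n * prior[c]."""
    n, k = len(probs), len(prior)
    q = [list(row) for row in probs]
    for _ in range(n_iters):
        # Column step: scale each class so its total mass equals n * prior[c].
        col_sums = [sum(q[i][c] for i in range(n)) for c in range(k)]
        for i in range(n):
            for c in range(k):
                q[i][c] *= (n * prior[c]) / col_sums[c]
        # Row step: renormalise each sample back to a probability vector.
        for i in range(n):
            row_sum = sum(q[i])
            q[i] = [v / row_sum for v in q[i]]
    return q


# Toy example (assumed values): four samples, two classes, and an assumed
# target prior under which the second class receives 25% of the mass.
probs = [[0.6, 0.4], [0.7, 0.3], [0.55, 0.45], [0.8, 0.2]]
prior = [0.75, 0.25]
soft_labels = sinkhorn_pseudo_labels(probs, prior)
hard_labels = [max(range(len(row)), key=row.__getitem__) for row in soft_labels]
```

In a training loop, soft labels of this kind would typically serve as targets for a cross-entropy loss on the unlabeled samples. How the prior itself is learned is left open here because the abstract does not describe it.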

