CVSep 26, 2025

Category Discovery: An Open-World Perspective

arXiv:2509.22542v26 citationsh-index: 2Has Code
Originality Synthesis-oriented
AI Analysis

It addresses the problem of automated categorization in open-world scenarios for researchers and practitioners, but is incremental as it synthesizes existing literature.

This survey reviews category discovery (CD), an open-world learning task for automatically categorizing unlabeled data from unseen classes using labeled data from seen classes, analyzing methods across settings like novel and generalized category discovery and benchmarking them to identify beneficial techniques and remaining challenges.

Category discovery (CD) is an emerging open-world learning task, which aims at automatically categorizing unlabelled data containing instances from unseen classes, given some labelled data from seen classes. This task has attracted significant attention over the years and leads to a rich body of literature trying to address the problem from different perspectives. In this survey, we provide a comprehensive review of the literature, and offer detailed analysis and in-depth discussion on different methods. Firstly, we introduce a taxonomy for the literature by considering two base settings, namely novel category discovery (NCD) and generalized category discovery (GCD), and several derived settings that are designed to address the extra challenges in different real-world application scenarios, including continual category discovery, skewed data distribution, federated category discovery, etc. Secondly, for each setting, we offer a detailed analysis of the methods encompassing three fundamental components, representation learning, label assignment, and estimation of class number. Thirdly, we benchmark all the methods and distill key insights showing that large-scale pretrained backbones, hierarchical and auxiliary cues, and curriculum-style training are all beneficial for category discovery, while challenges remain in the design of label assignment, the estimation of class numbers, and scaling to complex multi-object scenarios. Finally, we discuss the key insights from the literature so far and point out promising future research directions. We compile a living survey of the category discovery literature at https://github.com/Visual-AI/Category-Discovery.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes