LG | Aug 9, 2023

When and How Does Known Class Help Discover Unknown Ones? Provable Understanding Through Spectral Analysis

Berkeley
arXiv:2308.05017v1 | 26 citations | h-index: 50
Originality: Highly original
AI Analysis

The paper addresses the problem of discovering novel classes in unlabeled data with the help of labeled data. It is aimed at machine learning researchers and practitioners and offers both theoretical insights and practical improvements.

The paper tackles the lack of theoretical foundations in Novel Class Discovery (NCD) by providing an analytical framework for understanding when and how known classes help discover novel ones. It introduces a graph-theoretic representation learned with a novel loss function, which yields provable error bounds and matches or outperforms baselines on benchmarks.

Novel Class Discovery (NCD) aims at inferring novel classes in an unlabeled set by leveraging prior knowledge from a labeled set with known classes. Despite its importance, there is a lack of theoretical foundations for NCD. This paper bridges the gap by providing an analytical framework to formalize and investigate when and how known classes can help discover novel classes. Tailored to the NCD problem, we introduce a graph-theoretic representation that can be learned by a novel NCD Spectral Contrastive Loss (NSCL). Minimizing this objective is equivalent to factorizing the graph's adjacency matrix, which allows us to derive a provable error bound and provide the sufficient and necessary condition for NCD. Empirically, NSCL can match or outperform several strong baselines on common benchmark datasets, which is appealing for practical usage while enjoying theoretical guarantees.
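The abstract's claim that minimizing the loss is equivalent to factorizing the graph's adjacency matrix can be checked numerically on a toy graph. The sketch below is an illustration under stated assumptions, not the paper's NSCL: it uses a generic spectral contrastive objective on a small random symmetric adjacency matrix, and the function names, the toy graph, and the embedding `Z` are all hypothetical. It shows that the factorization error of the degree-normalized adjacency matrix and the contrastive objective differ only by a constant that does not depend on the embedding, so the two share the same minimizers.

```python
import numpy as np


def normalized_adjacency(A):
    """Return D^{-1/2} A D^{-1/2} for a symmetric, nonnegative matrix A."""
    w = A.sum(axis=1)                      # node weights (degrees)
    d = 1.0 / np.sqrt(w)
    return A * d[:, None] * d[None, :]


def factorization_loss(A, Z):
    """Squared Frobenius error ||A_norm - F F^T||^2 with F = diag(sqrt(w)) Z."""
    w = A.sum(axis=1)
    F = np.sqrt(w)[:, None] * Z            # rescale each embedding row by sqrt(w_x)
    R = normalized_adjacency(A) - F @ F.T
    return float(np.sum(R ** 2))


def spectral_contrastive_loss(A, Z):
    """Generic spectral contrastive objective written on the graph (assumed form).

    The attraction term pulls connected nodes together, weighted by edge weight;
    the repulsion term penalizes squared similarity between all node pairs,
    weighted by the product of their node weights.
    """
    w = A.sum(axis=1)
    G = Z @ Z.T                            # pairwise inner products z_x^T z_x'
    attract = -2.0 * np.sum(A * G)
    repel = np.sum(np.outer(w, w) * G ** 2)
    return float(attract + repel)


# Toy graph: random symmetric weights normalized to sum to 1 (an assumption).
rng = np.random.default_rng(0)
n, k = 6, 2
B = rng.random((n, n))
A = (B + B.T) / 2
A /= A.sum()

Z = rng.normal(size=(n, k))                # arbitrary embedding, one row per node
const = float(np.sum(normalized_adjacency(A) ** 2))
gap = factorization_loss(A, Z) - spectral_contrastive_loss(A, Z)

# The gap equals ||A_norm||_F^2 regardless of Z.
print(abs(gap - const) < 1e-9)
```

Because the gap is independent of `Z`, any embedding that minimizes the contrastive objective also minimizes the factorization error. How the paper builds its adjacency matrix from labeled and unlabeled data, and how it weights those contributions, is not reproduced here.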

Code Implementations: 1 repo