LG | Aug 9, 2023

When and How Does Known Class Help Discover Unknown Ones? Provable Understanding Through Spectral Analysis

Yiyou Sun, Zhenmei Shi, Yingyu Liang, Yixuan Li

Berkeley

arXiv:2308.05017v1 | 16.526 citations | h-index: 50 | Has Code

Originality: Highly original

AI Analysis

The paper addresses the problem of discovering novel classes in unlabeled data with the help of labeled data from known classes. It is aimed at machine learning researchers and practitioners and offers both theoretical insights and practical improvements.

The paper tackles the lack of theoretical foundations in Novel Class Discovery (NCD) by providing an analytical framework for understanding when and how known classes help discover novel ones. It introduces a graph-theoretic representation learned with a novel loss function, which yields provable error bounds and matches or outperforms baselines on benchmarks.

Novel Class Discovery (NCD) aims at inferring novel classes in an unlabeled set by leveraging prior knowledge from a labeled set with known classes. Despite its importance, there is a lack of theoretical foundations for NCD. This paper bridges the gap by providing an analytical framework to formalize and investigate when and how known classes can help discover novel classes. Tailored to the NCD problem, we introduce a graph-theoretic representation that can be learned by a novel NCD Spectral Contrastive Loss (NSCL). Minimizing this objective is equivalent to factorizing the graph's adjacency matrix, which allows us to derive a provable error bound and provide the sufficient and necessary condition for NCD. Empirically, NSCL can match or outperform several strong baselines on common benchmark datasets, which is appealing for practical usage while enjoying theoretical guarantees.
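
The link between a spectral objective and factorizing an adjacency matrix can be illustrated on a toy graph. The sketch below is not the paper's NSCL objective or its graph construction. It assumes a hand-made 6-node graph with two tightly connected groups, assumes the symmetrically normalized adjacency matrix is the one being factorized, and uses an eigendecomposition in place of gradient-based training. The function names are hypothetical.

```python
import numpy as np


def normalized_adjacency(A):
    """Symmetric normalization D^{-1/2} A D^{-1/2} (an assumption of this sketch)."""
    d = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    return A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def spectral_embedding(A, k):
    """Return F (n x k) such that F @ F.T is the best rank-k PSD
    approximation of the normalized adjacency matrix."""
    A_norm = normalized_adjacency(A)
    eigvals, eigvecs = np.linalg.eigh(A_norm)   # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:k]         # indices of the k largest
    lam = np.clip(eigvals[top], 0.0, None)      # guard against tiny negatives
    return eigvecs[:, top] * np.sqrt(lam)


# Toy graph (hypothetical): nodes 0-2 form one group, nodes 3-5 another.
# Within-group edge weight is 1.0, cross-group edge weight is 0.05.
n = 6
A = np.full((n, n), 0.05)
A[:3, :3] = 1.0
A[3:, 3:] = 1.0

F = spectral_embedding(A, k=2)
A_norm = normalized_adjacency(A)

# Frobenius norm of the factorization residual.
residual = np.linalg.norm(A_norm - F @ F.T)
print("residual:", residual)
print("embeddings:\n", F)
```

In this toy case the normalized adjacency matrix has rank 2, so a 2-dimensional embedding reproduces it almost exactly, and nodes in the same group receive identical embedding rows while the two groups are separated. Real NCD graphs are far larger and are never materialized explicitly, which is why a loss minimized over sampled pairs is used instead of a direct eigendecomposition.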
