CV · AI · Oct 29, 2024

A Fresh Look at Generalized Category Discovery through Non-negative Matrix Factorization

arXiv:2410.21807v2 · 3 citations · h-index: 7 · IEEE Transactions on Circuits and Systems for Video Technology (Print)
Originality: Incremental advance
AI Analysis

This paper addresses the classification of both base and novel images in computer vision and represents a strong but incremental improvement over existing GCD methods.

The paper tackles the problem of Generalized Category Discovery (GCD) by proposing a Non-Negative Generalized Category Discovery (NN-GCD) framework that uses Symmetric Non-negative Matrix Factorization to optimize co-occurrence matrices and clustering. It achieves an average accuracy of 66.1% on the Semantic Shift Benchmark, surpassing prior methods by 4.7%.

Generalized Category Discovery (GCD) aims to classify both base and novel images using labeled base data. However, current approaches inadequately address the intrinsic optimization of the co-occurrence matrix $\bar{A}$ based on cosine similarity, failing to achieve zero base-novel regions and adequate sparsity in base and novel domains. To address these deficiencies, we propose a Non-Negative Generalized Category Discovery (NN-GCD) framework. It employs Symmetric Non-negative Matrix Factorization (SNMF) as a mathematical medium to prove the equivalence of optimal K-means with optimal SNMF, and the equivalence of SNMF solver with non-negative contrastive learning (NCL) optimization. Utilizing these theoretical equivalences, it reframes the optimization of $\bar{A}$ and K-means clustering as an NCL optimization problem. Moreover, to satisfy the non-negative constraints and make a GCD model converge to a near-optimal region, we propose a GELU activation function and an NMF NCE loss. To transition $\bar{A}$ from a suboptimal state to the desired $\bar{A}^*$, we introduce a hybrid sparse regularization approach to impose sparsity constraints. Experimental results show NN-GCD outperforms state-of-the-art methods on GCD benchmarks, achieving an average accuracy of 66.1\% on the Semantic Shift Benchmark, surpassing prior counterparts by 4.7\%.
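
To make the objects in the abstract concrete, the following is a minimal sketch of Symmetric NMF applied to a cosine-similarity co-occurrence matrix, i.e. approximately solving $\min_{H \ge 0} \|\bar{A} - HH^\top\|_F^2$. It is not the paper's NCL-based solver: it swaps in the classical damped multiplicative update for SNMF, and the toy features, the function names `cosine_cooccurrence` and `symmetric_nmf`, and the parameters `n_iter` and `beta` are all assumptions made for illustration. The two toy groups have disjoint feature support, so the cross-group block of $\bar{A}$ is exactly zero, loosely mirroring the "zero base-novel regions" the abstract asks for.

```python
import numpy as np


def cosine_cooccurrence(Z, eps=1e-12):
    """Return the cosine-similarity matrix of the rows of Z.

    If Z is non-negative, every entry of the result is non-negative too.
    """
    Z_hat = Z / (np.linalg.norm(Z, axis=1, keepdims=True) + eps)
    return Z_hat @ Z_hat.T


def symmetric_nmf(A, k, n_iter=500, beta=0.5, seed=0, eps=1e-12):
    """Approximate A by H @ H.T with H >= 0 (generic solver, assumed here).

    Uses the damped multiplicative update
        H <- H * ((1 - beta) + beta * (A H) / (H H^T H)),
    which keeps H non-negative when it starts non-negative.
    """
    rng = np.random.default_rng(seed)
    H = rng.random((A.shape[0], k))  # non-negative initialisation
    for _ in range(n_iter):
        numer = A @ H
        denom = H @ (H.T @ H) + eps  # eps avoids division by zero
        H *= (1.0 - beta) + beta * numer / denom
    return H


# Toy data (hypothetical): two groups of three samples each,
# with non-negative features living on disjoint dimensions.
rng = np.random.default_rng(0)
Z = np.zeros((6, 4))
Z[:3, :2] = rng.random((3, 2)) + 0.1  # group 1 uses dimensions 0-1
Z[3:, 2:] = rng.random((3, 2)) + 0.1  # group 2 uses dimensions 2-3

A = cosine_cooccurrence(Z)   # block-diagonal: cross-group block is zero
H = symmetric_nmf(A, k=2)    # non-negative factor, one column per cluster
labels = H.argmax(axis=1)    # read a cluster assignment off the factor

print(np.round(A, 2))
print(np.round(H, 2))
print(labels)
```

Reading cluster labels from the largest entry in each row of `H` reflects the K-means connection the abstract mentions: when `H` is close to a scaled cluster-indicator matrix, `H @ H.T` reproduces the block structure of $\bar{A}$. On inputs this cleanly separated the two groups would usually be expected to land in different columns, though a multiplicative-update solver only guarantees a local optimum.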

