ML · LG · Nov 2, 2017

Candidates vs. Noises Estimation for Large Multi-Class Classification Problem

arXiv:1711.00658v2 · 2 citations
Originality: Incremental advance
AI Analysis

This paper addresses the computational cost and accuracy of large-scale classification problems, such as language modeling, and offers incremental improvements over existing methods.

The paper tackles large multi-class classification by proposing the CANE method, which selects candidate classes and samples the rest, achieving better accuracy than state-of-the-art methods like NCE and tree classifiers while gaining significant speedup compared to standard O(K) approaches.

This paper proposes a method for multi-class classification problems, where the number of classes K is large. The method, referred to as Candidates vs. Noises Estimation (CANE), selects a small subset of candidate classes and samples the remaining classes. We show that CANE is always consistent and computationally efficient. Moreover, the resulting estimator has low statistical variance approaching that of the maximum likelihood estimator, when the observed label belongs to the selected candidates with high probability. In practice, we use a tree structure with leaves as classes to promote fast beam search for candidate selection. We further apply the CANE method to estimate word probabilities in learning large neural language models. Extensive experimental results show that CANE achieves better prediction accuracy than Noise-Contrastive Estimation (NCE), its variants, and a number of state-of-the-art tree classifiers, and gains a significant speedup over standard O(K) methods.
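The split between candidates and sampled classes can be illustrated with a minimal Python sketch. This is not the paper's estimator: it is a generic importance-sampling approximation of the softmax normalizer, in which candidate classes are summed exactly and the remaining classes are represented by a few uniform samples. The function names, the uniform noise distribution, and the parameter `num_noise` are assumptions made for illustration only.

```python
import math
import random


def approx_log_partition(scores, candidates, num_noise, rng):
    """Estimate log(sum_k exp(scores[k])) from candidates plus sampled noise.

    Candidate classes contribute exactly. The remaining classes are
    represented by `num_noise` uniform samples (with replacement),
    reweighted so that the estimate of the sum itself is unbiased.
    Illustrative assumption: uniform noise, no numerical stabilization.
    """
    if num_noise < 1:
        raise ValueError("num_noise must be at least 1")
    candidate_set = set(candidates)
    # Enumerated here only for clarity; this list costs O(K) to build,
    # which a practical implementation would avoid.
    rest = [k for k in range(len(scores)) if k not in candidate_set]
    exact = sum(math.exp(scores[k]) for k in candidate_set)
    if not rest:
        # Every class is a candidate, so the sum is exact.
        return math.log(exact)
    noise = [rng.choice(rest) for _ in range(num_noise)]
    # Each sample stands in for len(rest) / num_noise unseen classes.
    weight = len(rest) / num_noise
    sampled = weight * sum(math.exp(scores[k]) for k in noise)
    return math.log(exact + sampled)


def approx_log_prob(scores, label, candidates, num_noise, rng):
    """Approximate log-softmax of `label` using the estimate above."""
    log_z = approx_log_partition(scores, candidates, num_noise, rng)
    return scores[label] - log_z


if __name__ == "__main__":
    rng = random.Random(0)
    K = 10_000
    scores = [rng.gauss(0.0, 1.0) for _ in range(K)]
    label = 42
    # Hypothetical candidate set; the paper obtains candidates by beam search on a tree.
    candidates = [label, 7, 99, 1234]
    exact = scores[label] - math.log(sum(math.exp(s) for s in scores))
    approx = approx_log_prob(scores, label, candidates, num_noise=50, rng=rng)
    print(f"exact log-prob:  {exact:.4f}")
    print(f"approx log-prob: {approx:.4f}")
```

Only the candidates and the sampled classes are scored exactly, so the per-example cost of the sum depends on the candidate and sample counts rather than on K. The approximation is tightest when the candidates carry most of the probability mass, which loosely mirrors the abstract's condition that the observed label falls among the selected candidates with high probability.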

