LG · ML · Sep 29, 2013

An upper bound on prototype set size for condensed nearest neighbor

arXiv:1309.7676v1
Originality: Synthesis-oriented
AI Analysis

This paper provides a theoretical guarantee for a heuristic method in machine learning. It is incremental: it builds on existing algorithms without introducing a new paradigm.

The paper bounds the number of prototypical points stored by the condensed nearest neighbor algorithm. The upper bound is independent of training set size and follows from a connection to the multiclass perceptron algorithm.
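To make the quantity being bounded concrete, here is a minimal sketch of a Hart-style condensed nearest neighbor pass. It is an illustration under assumptions, not the paper's exact procedure: the function name `condensed_nearest_neighbor`, the Euclidean metric, and the choice of seeding the store with the first training point are all hypothetical choices made for this example.

```python
import numpy as np

def condensed_nearest_neighbor(X, y):
    """Return indices of the training points kept as prototypes.

    Assumed variant: Euclidean distance, store seeded with point 0,
    passes repeated until a full pass adds no new prototype.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    prototypes = [0]  # seed the store with the first training point
    changed = True
    while changed:
        changed = False
        for i in range(len(X)):
            if i in prototypes:
                continue
            # 1-NN prediction for point i using only the current store
            dists = np.linalg.norm(X[prototypes] - X[i], axis=1)
            nearest = prototypes[int(np.argmin(dists))]
            if y[nearest] != y[i]:
                # misclassified by the store, so the point becomes a prototype
                prototypes.append(i)
                changed = True
    return prototypes

# Two well-separated clusters on a line (toy data, not from the paper)
X_toy = [[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]]
y_toy = [0, 0, 0, 1, 1, 1]
kept = condensed_nearest_neighbor(X_toy, y_toy)
```

On this toy data the store ends with one prototype per cluster, and a 1-NN rule over the kept points still labels every training point correctly. The length of `kept` is the number the paper's bound is about.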

The condensed nearest neighbor (CNN) algorithm is a heuristic for reducing the number of prototypical points stored by a nearest neighbor classifier, while keeping the classification rule given by the reduced prototypical set consistent with the full set. I present an upper bound on the number of prototypical points accumulated by CNN. The bound originates in a bound on the number of times the decision rule is updated during training in the multiclass perceptron algorithm, and thus is independent of training set size.
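The abstract ties the bound to the number of times the multiclass perceptron updates its decision rule. The sketch below only counts those updates on toy data; it does not reproduce the paper's bound, which the abstract does not state. The function name `multiclass_perceptron`, the one-weight-vector-per-class update, and the `max_epochs` cap are assumptions made for illustration.

```python
import numpy as np

def multiclass_perceptron(X, y, n_classes, max_epochs=100):
    """Train a multiclass perceptron and count decision-rule updates.

    Assumed variant: one weight vector per class; on a mistake, add the
    example to the true class's vector and subtract it from the
    predicted class's vector.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    W = np.zeros((n_classes, X.shape[1]))
    updates = 0
    for _ in range(max_epochs):
        mistakes = 0
        for x, label in zip(X, y):
            pred = int(np.argmax(W @ x))
            if pred != label:
                W[label] += x
                W[pred] -= x
                updates += 1
                mistakes += 1
        if mistakes == 0:  # a clean pass means the rule is consistent
            break
    return W, updates

# Toy separable data (not from the paper), then the same data repeated 50 times
X_small = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 0.0], [0.0, 2.0]])
y_small = np.array([0, 1, 0, 1])
_, updates_small = multiclass_perceptron(X_small, y_small, n_classes=2)
_, updates_large = multiclass_perceptron(
    np.tile(X_small, (50, 1)), np.tile(y_small, 50), n_classes=2
)
```

On this toy data the update count is the same for the small set and for the set repeated 50 times, which illustrates the kind of size-independence the abstract describes.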

