MGLGMLFeb 28, 2020

Universal consistency of the $k$-NN rule in metric spaces and Nagata dimension

arXiv:2003.00894v29 citations
AI Analysis

This work extends a foundational result in machine learning to more general metric spaces, which is incremental but important for theoretical understanding of non-Euclidean data.

The paper proves that the k-nearest neighbor learning rule is universally consistent in metric spaces with finite Nagata dimension, extending Stone's classic result from Euclidean spaces to a broader class of metrics. It provides a direct proof addressing challenges like distance ties in non-Euclidean settings, with examples illustrating geometric limitations.

The $k$ nearest neighbour learning rule (under the uniform distance tie breaking) is universally consistent in every metric space $X$ that is sigma-finite dimensional in the sense of Nagata. This was pointed out by Cérou and Guyader (2006) as a consequence of the main result by those authors, combined with a theorem in real analysis sketched by D. Preiss (1971) (and elaborated in detail by Assouad and Quentin de Gromard (2006)). We show that it is possible to give a direct proof along the same lines as the original theorem of Charles J. Stone (1977) about the universal consistency of the $k$-NN classifier in the finite dimensional Euclidean space. The generalization is non-trivial because of the distance ties being more prevalent in the non-euclidean setting, and on the way we investigate the relevant geometric properties of the metrics and the limitations of the Stone argument, by constructing various examples.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes