LG · ML · Jun 27, 2012

Shortest path distance in random k-nearest neighbor graphs

arXiv:1206.6381v2 · 48 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses a theoretical issue in graph-based machine learning: shortest path distances in unweighted kNN graphs are a poor basis for distance estimation.

The paper investigates how shortest path distances in random k-nearest neighbor graphs behave as the sample size increases. It proves that in unweighted kNN graphs the shortest path distance converges to a distance function whose properties are harmful for machine learning, and it also examines the behavior in weighted kNN graphs.

Consider a weighted or unweighted k-nearest neighbor graph that has been built on n data points drawn randomly according to some density p on R^d. We study the convergence of the shortest path distance in such graphs as the sample size tends to infinity. We prove that for unweighted kNN graphs, this distance converges to an unpleasant distance function on the underlying space whose properties are detrimental to machine learning. We also study the behavior of the shortest path distance in weighted kNN graphs.
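To make the two objects in the abstract concrete, here is a minimal sketch that builds a kNN graph on random points and computes both the unweighted (hop count) and the weighted (summed edge length) shortest path distance from one source point. Several choices are assumptions for illustration only and are not taken from the paper: the density p is uniform on the unit square (d = 2), the graph is symmetrized (i and j are joined if either is among the other's k nearest neighbors), edge weights are Euclidean lengths, and the function names `knn_graph`, `hop_distances`, and `weighted_distances` are hypothetical.

```python
import heapq
import math
import random
from collections import deque


def knn_graph(points, k):
    """Symmetric kNN graph (assumed convention): i~j if either point is
    among the other's k nearest neighbors. Edge weight = Euclidean length."""
    n = len(points)
    adj = [dict() for _ in range(n)]
    for i in range(n):
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in dists[:k]:
            adj[i][j] = d
            adj[j][i] = d
    return adj


def hop_distances(adj, source):
    """Unweighted shortest path distance (number of edges), via BFS."""
    dist = [math.inf] * len(adj)
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if dist[v] == math.inf:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def weighted_distances(adj, source):
    """Weighted shortest path distance (sum of edge lengths), via Dijkstra."""
    dist = [math.inf] * len(adj)
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Assumption: uniform density on the unit square, n = 300, k = 10.
random.seed(0)
points = [(random.random(), random.random()) for _ in range(300)]
adj = knn_graph(points, k=10)
hops = hop_distances(adj, 0)          # unweighted kNN graph distance
weighted = weighted_distances(adj, 0)  # weighted kNN graph distance
```

Comparing `hops` and `weighted` against the Euclidean distance from point 0 for growing n is one simple way to explore the limiting behavior the paper analyzes. The sketch itself makes no claim about what that limit is.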

