LG · AI · DS · IR · ML · Mar 1, 2017

Fast k-Nearest Neighbour Search via Prioritized DCI

arXiv:1703.00440v25.738 citations
Originality: Highly original
AI Analysis

This addresses the curse of dimensionality in exact k-nearest neighbour search, reporting fewer distance evaluations and lower memory consumption than prior methods such as LSH.

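The paper tackles exact k-nearest neighbour search, whose query times typically depend exponentially on the ambient or intrinsic dimensionality. Dynamic Continuous Indexing (DCI) already reduces the dependence on intrinsic dimensionality from exponential to sublinear. The proposed variant, Prioritized DCI, improves this dependence further, so that a linear increase in intrinsic dimensionality can be mostly counteracted with a linear increase in space. Empirically, relative to LSH, it reduces the number of distance evaluations by a factor of 14 to 116 and memory consumption by a factor of 21.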

Most exact methods for k-nearest neighbour search suffer from the curse of dimensionality; that is, their query times exhibit exponential dependence on either the ambient or the intrinsic dimensionality. Dynamic Continuous Indexing (DCI) offers a promising way of circumventing the curse and successfully reduces the dependence of query time on intrinsic dimensionality from exponential to sublinear. In this paper, we propose a variant of DCI, which we call Prioritized DCI, and show a remarkable improvement in the dependence of query time on intrinsic dimensionality. In particular, a linear increase in intrinsic dimensionality, or equivalently, an exponential increase in the number of points near a query, can be mostly counteracted with just a linear increase in space. We also demonstrate empirically that Prioritized DCI significantly outperforms prior methods. In particular, relative to Locality-Sensitive Hashing (LSH), Prioritized DCI reduces the number of distance evaluations by a factor of 14 to 116 and the memory consumption by a factor of 21.
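The abstract does not describe the mechanics of the method, so the following is only a simplified sketch of the general idea of prioritized search over random one-dimensional projections, not the authors' implementation. It assumes that points are indexed by sorting their projections onto random directions, and that at query time the projection whose next unvisited point lies closest to the query's projection is expanded first. The class name `ProjectionIndex`, the parameters `m` (projections per composite index) and `L` (number of composite indices), and the `max_visits` budget are hypothetical names chosen for illustration. The stopping rule here is a plain visit budget and carries none of the paper's guarantees.

```python
import bisect
import heapq
import math
import random


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _unit_vector(dim, rng):
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(_dot(v, v))
    return [x / norm for x in v]


class ProjectionIndex:
    """Illustrative only: L composite indices, each with m sorted 1-D projections."""

    def __init__(self, points, m=3, L=2, seed=0):
        rng = random.Random(seed)
        self.points = points
        dim = len(points[0])
        self.composites = []
        for _ in range(L):
            simple = []
            for _ in range(m):
                u = _unit_vector(dim, rng)
                # One simple index: (projection value, point id), sorted by projection.
                entries = sorted((_dot(p, u), i) for i, p in enumerate(points))
                simple.append((u, entries))
            self.composites.append(simple)

    @staticmethod
    def _push_next(heap, j, entries, cursor):
        # cursor = [query projection, next index to the left, next index to the right]
        qp, lo, hi = cursor
        left = qp - entries[lo][0] if lo >= 0 else math.inf
        right = entries[hi][0] - qp if hi < len(entries) else math.inf
        if left == math.inf and right == math.inf:
            return  # this simple index is exhausted
        if left <= right:
            heapq.heappush(heap, (left, j, entries[lo][1]))
            cursor[1] = lo - 1
        else:
            heapq.heappush(heap, (right, j, entries[hi][1]))
            cursor[2] = hi + 1

    def query(self, q, k, max_visits):
        """Return (ids of up to k nearest candidates, number of distance evaluations)."""
        candidates = set()
        for simple in self.composites:
            m = len(simple)
            heap, cursors = [], []
            for j, (u, entries) in enumerate(simple):
                qp = _dot(q, u)
                hi = bisect.bisect_left(entries, (qp, -1))
                cursors.append([qp, hi - 1, hi])
                self._push_next(heap, j, entries, cursors[j])
            seen = {}
            for _ in range(max_visits):
                if not heap:
                    break
                # Prioritization: expand the projection whose next point is closest.
                _, j, pid = heapq.heappop(heap)
                seen[pid] = seen.get(pid, 0) + 1
                if seen[pid] == m:  # visited in every simple index of this composite
                    candidates.add(pid)
                self._push_next(heap, j, simple[j][1], cursors[j])
        # True distances are evaluated only for the candidates.
        scored = sorted((math.dist(q, self.points[i]), i) for i in candidates)
        return [i for _, i in scored[:k]], len(candidates)
```

In this sketch, setting `max_visits` to `m` times the number of points makes every point a candidate, so the result matches a brute-force scan. Smaller budgets evaluate fewer distances at the cost of possibly missing true neighbours.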

