STMLDec 6, 2017

Achieving the time of $1$-NN, but the accuracy of $k$-NN

arXiv:1712.02369v214 citations
Originality Incremental advance
AI Analysis

This addresses a computational bottleneck in machine learning for applications requiring fast, accurate classification, though it is incremental as it builds on existing nearest neighbor methods.

The paper tackles the problem of balancing accuracy and speed in nearest neighbor classification by proposing a distributed approach that aggregates denoised 1-NN predictors over subsamples to achieve k-NN accuracy while maintaining 1-NN prediction time, with theoretical and experimental validation showing small subsample sizes suffice.

We propose a simple approach which, given distributed computing resources, can nearly achieve the accuracy of $k$-NN prediction, while matching (or improving) the faster prediction time of $1$-NN. The approach consists of aggregating denoised $1$-NN predictors over a small number of distributed subsamples. We show, both theoretically and experimentally, that small subsample sizes suffice to attain similar performance as $k$-NN, without sacrificing the computational efficiency of $1$-NN.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes