Achieving the time of $1$-NN, but the accuracy of $k$-NN
This addresses a computational bottleneck in machine learning for applications requiring fast, accurate classification, though it is incremental as it builds on existing nearest neighbor methods.
The paper tackles the problem of balancing accuracy and speed in nearest neighbor classification by proposing a distributed approach that aggregates denoised 1-NN predictors over subsamples to achieve k-NN accuracy while maintaining 1-NN prediction time, with theoretical and experimental validation showing small subsample sizes suffice.
We propose a simple approach which, given distributed computing resources, can nearly achieve the accuracy of $k$-NN prediction, while matching (or improving) the faster prediction time of $1$-NN. The approach consists of aggregating denoised $1$-NN predictors over a small number of distributed subsamples. We show, both theoretically and experimentally, that small subsample sizes suffice to attain similar performance as $k$-NN, without sacrificing the computational efficiency of $1$-NN.