MLDSLGCOMEMay 8, 2024

Fast Computation of Leave-One-Out Cross-Validation for $k$-NN Regression

arXiv:2405.04919v24 citationsh-index: 13Trans. Mach. Learn. Res.
Originality Incremental advance
AI Analysis

This provides a significant speed-up for model selection in k-NN regression, though it is incremental as it builds on existing methods with a specific optimization.

The paper tackles the computational inefficiency of leave-one-out cross-validation (LOOCV) for k-nearest neighbors regression by proving that the LOOCV estimate equals the mean square error of (k+1)-NN regression on training data scaled by a factor, enabling fast computation with a single model fit.

We describe a fast computation method for leave-one-out cross-validation (LOOCV) for $k$-nearest neighbours ($k$-NN) regression. We show that, under a tie-breaking condition for nearest neighbours, the LOOCV estimate of the mean square error for $k$-NN regression is identical to the mean square error of $(k+1)$-NN regression evaluated on the training data, multiplied by the scaling factor $(k+1)^2/k^2$. Therefore, to compute the LOOCV score, one only needs to fit $(k+1)$-NN regression only once, and does not need to repeat training-validation of $k$-NN regression for the number of training data. Numerical experiments confirm the validity of the fast computation method.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes