ML · LG · Mar 2, 2020

Approximate Cross-validation: Guarantees for Model Assessment and Selection

arXiv:2003.00617v2 · 29 citations
AI Analysis

This paper addresses the high computational cost of CV for machine learning practitioners, offering a faster alternative with theoretical assurances. The contribution is incremental in that it builds on existing approximate ERM methods.

The paper tackles the computational cost of cross-validation (CV) for model assessment and selection. It proposes an approximate CV method based on Newton steps, provides deterministic guarantees for model assessment and selection comparable to those of CV, and extends the approach to non-smooth objectives such as l1-regularized ERM, with improved assessment guarantees.

Cross-validation (CV) is a popular approach for assessing and selecting predictive models. However, when the number of folds is large, CV suffers from a need to repeatedly refit a learning procedure on a large number of training datasets. Recent work in empirical risk minimization (ERM) approximates the expensive refitting with a single Newton step warm-started from the full training set optimizer. While this can greatly reduce runtime, several open questions remain including whether these approximations lead to faithful model selection and whether they are suitable for non-smooth objectives. We address these questions with three main contributions: (i) we provide uniform non-asymptotic, deterministic model assessment guarantees for approximate CV; (ii) we show that (roughly) the same conditions also guarantee model selection performance comparable to CV; (iii) we provide a proximal Newton extension of the approximate CV framework for non-smooth prediction problems and develop improved assessment guarantees for problems such as l1-regularized ERM.
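The Newton-step idea in the abstract can be illustrated with a minimal sketch. It assumes l2-regularized logistic regression on synthetic data and leave-one-out CV; the function names `fit_logistic` and `approx_loo`, the objective scaling (sum of losses plus `(lam / 2) * ||beta||^2`), and all parameter values are illustrative assumptions, not taken from the paper or its repository. Because the gradient of the full objective vanishes at the full-data optimizer `beta_hat`, the gradient of the objective without point i is minus that point's loss gradient, so one Newton step gives `beta_hat + H_minus_i^{-1} grad_i`.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lam, n_iter=50, tol=1e-10):
    """Minimize sum_i logloss_i(beta) + (lam/2)*||beta||^2 by Newton's method."""
    n, d = X.shape
    beta = np.zeros(d)
    for _ in range(n_iter):
        p = sigmoid(X @ beta)
        grad = X.T @ (p - y) + lam * beta
        hess = (X * (p * (1 - p))[:, None]).T @ X + lam * np.eye(d)
        step = np.linalg.solve(hess, grad)
        beta = beta - step
        if np.linalg.norm(step) < tol:
            break
    return beta

def approx_loo(X, y, lam, beta_hat):
    """One Newton step per held-out point, warm-started at beta_hat."""
    n, d = X.shape
    p = sigmoid(X @ beta_hat)
    w = p * (1 - p)
    hess_full = (X * w[:, None]).T @ X + lam * np.eye(d)
    betas = np.empty((n, d))
    for i in range(n):
        grad_i = (p[i] - y[i]) * X[i]                       # gradient of loss i
        hess_loo = hess_full - w[i] * np.outer(X[i], X[i])  # Hessian without point i
        betas[i] = beta_hat + np.linalg.solve(hess_loo, grad_i)
    return betas

def heldout_loss(X, y, betas):
    """Mean log loss of point i under the parameters fit without point i."""
    z = np.einsum("ij,ij->i", X, betas)
    return np.mean(np.logaddexp(0.0, z) - y * z)

# Synthetic data (assumed setup, for illustration only).
rng = np.random.default_rng(0)
n, d, lam = 300, 3, 1.0
X = rng.normal(size=(n, d))
beta_true = np.array([1.0, -0.5, 0.25])
y = (rng.random(n) < sigmoid(X @ beta_true)).astype(float)

beta_hat = fit_logistic(X, y, lam)
acv_betas = approx_loo(X, y, lam, beta_hat)

# Exact leave-one-out: refit n times (the cost the approximation avoids).
exact_betas = np.array([
    fit_logistic(np.delete(X, i, axis=0), np.delete(y, i), lam)
    for i in range(n)
])

err_acv = np.max(np.linalg.norm(acv_betas - exact_betas, axis=1))
err_none = np.max(np.linalg.norm(beta_hat - exact_betas, axis=1))
print("max parameter error, Newton-step approximation:", err_acv)
print("max parameter error, no correction:", err_none)
print("exact LOO loss:", heldout_loss(X, y, exact_betas))
print("approx LOO loss:", heldout_loss(X, y, acv_betas))
```

The sketch solves one small linear system per held-out point instead of running a full refit, which is where the runtime saving comes from. It does not cover the proximal Newton extension for non-smooth penalties such as l1.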

Code Implementations: 1 repo
