ML | LG | May 18, 2018

Optimizing for Generalization in Machine Learning with Cross-Validation Gradients

arXiv:1805.07072v1 | 12 citations
Originality: Incremental advance
AI Analysis

This work addresses hyperparameter selection, a routine challenge for machine learning practitioners, and offers a more efficient method for model tuning. It appears incremental because it builds on existing cross-validation techniques.

The paper tackles hyperparameter optimization in machine learning by showing that the cross-validation risk is differentiable for common algorithms. It proposes a cross-validation gradient method (CVGM) that optimizes this risk efficiently in high-dimensional hyperparameter spaces, with the aim of improving generalization performance.

Cross-validation is the workhorse of modern applied statistics and machine learning, as it provides a principled framework for selecting the model that maximizes generalization performance. In this paper, we show that the cross-validation risk is differentiable with respect to the hyperparameters and training data for many common machine learning algorithms, including logistic regression, elastic-net regression, and support vector machines. Leveraging this property of differentiability, we propose a cross-validation gradient method (CVGM) for hyperparameter optimization. Our method enables efficient optimization in high-dimensional hyperparameter spaces of the cross-validation risk, the best surrogate of the true generalization ability of our learning algorithm.
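
The idea of differentiating the cross-validation risk with respect to a hyperparameter can be illustrated on a deliberately simple case. The sketch below is not the paper's CVGM implementation: it assumes one-dimensional ridge regression (a stand-in for the elastic-net, logistic regression, and SVM cases named in the abstract), where the fit has a closed form and the derivative can be written by hand. All function names (make_data, make_folds, cv_risk_and_grad, tune_lambda) and default values are hypothetical and chosen for illustration only.

```python
import math
import random


def make_data(n=40, seed=0):
    """Synthetic data (an assumption for this sketch): y = 2x + Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


def make_folds(n, k):
    """Split indices 0..n-1 into k contiguous validation folds."""
    size = n // k
    return [list(range(i * size, (i + 1) * size if i < k - 1 else n))
            for i in range(k)]


def cv_risk_and_grad(xs, ys, lam, k=5):
    """Return the k-fold CV squared-error risk of 1-D ridge regression
    and its derivative with respect to the penalty lam."""
    n = len(xs)
    risk, grad = 0.0, 0.0
    for val_idx in make_folds(n, k):
        held_out = set(val_idx)
        train_idx = [i for i in range(n) if i not in held_out]
        sxy = sum(xs[i] * ys[i] for i in train_idx)
        sxx = sum(xs[i] * xs[i] for i in train_idx)
        w = sxy / (sxx + lam)        # closed-form ridge fit on the training folds
        dw_dlam = -w / (sxx + lam)   # derivative of the fitted weight w.r.t. lam
        fold_risk, fold_grad = 0.0, 0.0
        for i in val_idx:
            r = ys[i] - w * xs[i]                      # validation residual
            fold_risk += r * r
            fold_grad += -2.0 * r * xs[i] * dw_dlam    # chain rule through w(lam)
        risk += fold_risk / len(val_idx) / k
        grad += fold_grad / len(val_idx) / k
    return risk, grad


def tune_lambda(xs, ys, lam0=10.0, lr=0.5, steps=200, k=5):
    """Plain gradient descent on theta = log(lam), which keeps lam positive."""
    theta = math.log(lam0)
    for _ in range(steps):
        lam = math.exp(theta)
        _, g = cv_risk_and_grad(xs, ys, lam, k)
        theta -= lr * g * lam        # d(risk)/d(theta) = d(risk)/d(lam) * lam
    return math.exp(theta)


if __name__ == "__main__":
    xs, ys = make_data()
    lam = tune_lambda(xs, ys)
    print("tuned lam:", lam, "CV risk:", cv_risk_and_grad(xs, ys, lam)[0])
```

With a single hyperparameter a grid search would do just as well. The gradient becomes useful when there are many hyperparameters, which is the high-dimensional setting the abstract targets. Optimizing log(lam) rather than lam is a design choice of this sketch, not something taken from the paper.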

Code Implementations: 1 repo
