ML, LG, OC
Feb 7, 2016

Hyperparameter optimization with approximate gradient

arXiv:1602.02355v6
520 citations
Originality: Incremental advance
AI Analysis

This work addresses the computationally challenging task of hyperparameter tuning for machine learning practitioners, though it appears incremental as it builds on existing gradient-based optimization approaches.

The authors tackled the problem of hyperparameter optimization by proposing an algorithm that uses approximate gradients, allowing hyperparameter updates before model parameters fully converge. They validated the method on L2-regularized logistic regression and kernel Ridge regression, showing it is highly competitive with state-of-the-art methods.

Most models in machine learning contain at least one hyperparameter to control for model complexity. Choosing an appropriate set of hyperparameters is both crucial in terms of model accuracy and computationally challenging. In this work we propose an algorithm for the optimization of continuous hyperparameters using inexact gradient information. An advantage of this method is that hyperparameters can be updated before model parameters have fully converged. We also give sufficient conditions for the global convergence of this method, based on regularity conditions of the involved functions and summability of errors. Finally, we validate the empirical performance of this method on the estimation of regularization constants of L2-regularized logistic regression and kernel Ridge regression. Empirical benchmarks indicate that our approach is highly competitive with respect to state of the art methods.
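The idea of updating a hyperparameter from an inexact gradient can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes plain (non-kernel) ridge regression with regularization strength exp(alpha), a mean squared validation loss, a fixed outer step size, and a geometrically decreasing (hence summable) tolerance sequence. All function and parameter names (`conjugate_gradient`, `approx_hypergradient`, `tune_alpha`, `tol0`, `decay`) are hypothetical. The hypergradient follows from implicit differentiation of the inner optimality condition: with H = XᵀX + exp(alpha)·I and H q = ∇g(w), the derivative of the validation loss with respect to alpha is −exp(alpha)·wᵀq. Both the inner problem and the linear system are solved only up to the current tolerance.

```python
import numpy as np


def conjugate_gradient(matvec, b, x0, tol, max_iter=1000):
    """Solve A x = b (A symmetric positive definite); stop once the residual norm is <= tol."""
    x = x0.copy()
    r = b - matvec(x)
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs) <= tol:
            break
        Ap = matvec(p)
        step = rs / (p @ Ap)
        x += step * p
        r -= step * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x


def approx_hypergradient(alpha, X_tr, y_tr, X_val, y_val, tol, w0=None, q0=None):
    """Inexact gradient of the validation loss w.r.t. alpha (assumed ridge setup)."""
    lam = np.exp(alpha)
    d = X_tr.shape[1]
    w0 = np.zeros(d) if w0 is None else w0
    q0 = np.zeros(d) if q0 is None else q0
    # Hessian-vector product of the inner (training) objective.
    hess = lambda v: X_tr.T @ (X_tr @ v) + lam * v
    # Inner problem solved only up to tolerance `tol`, warm-started from w0.
    w = conjugate_gradient(hess, X_tr.T @ y_tr, w0, tol)
    grad_val = X_val.T @ (X_val @ w - y_val) / len(y_val)
    # Linear system H q = grad_val, also solved only up to `tol`.
    q = conjugate_gradient(hess, grad_val, q0, tol)
    return -lam * (w @ q), w, q


def validation_loss(alpha, X_tr, y_tr, X_val, y_val):
    """Exact validation loss, used here only to check the sketch."""
    H = X_tr.T @ X_tr + np.exp(alpha) * np.eye(X_tr.shape[1])
    w = np.linalg.solve(H, X_tr.T @ y_tr)
    return 0.5 * np.mean((X_val @ w - y_val) ** 2)


def tune_alpha(X_tr, y_tr, X_val, y_val, alpha0, n_outer=100,
               step=0.1, tol0=0.1, decay=0.9):
    """Gradient descent on alpha with tolerances tol0 * decay**k (an assumed schedule)."""
    alpha, w, q = alpha0, None, None
    for k in range(n_outer):
        tol = tol0 * decay ** k  # geometric sequence, so the errors are summable
        grad, w, q = approx_hypergradient(alpha, X_tr, y_tr, X_val, y_val, tol, w, q)
        alpha -= step * grad     # update before the inner solve is exact
    return alpha


# Synthetic data, purely for illustration.
rng = np.random.default_rng(0)
w_true = rng.standard_normal(20)
X_tr, X_val = rng.standard_normal((80, 20)), rng.standard_normal((80, 20))
y_tr = X_tr @ w_true + 0.5 * rng.standard_normal(80)
y_val = X_val @ w_true + 0.5 * rng.standard_normal(80)

alpha0 = np.log(1000.0)  # deliberately over-regularized starting point
alpha_hat = tune_alpha(X_tr, y_tr, X_val, y_val, alpha0)
print("validation loss before:", validation_loss(alpha0, X_tr, y_tr, X_val, y_val))
print("validation loss after: ", validation_loss(alpha_hat, X_tr, y_tr, X_val, y_val))
```

Warm-starting both solves from the previous iterate keeps early outer iterations cheap, since loose tolerances are met after few conjugate gradient steps. The step size and tolerance schedule here are illustrative choices, not values taken from the paper.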

Code Implementations: 1 repo