LGMLJan 16, 2020

Scalable Hyperparameter Optimization with Lazy Gaussian Processes

arXiv:2001.05726v11 citations
Originality Incremental advance
AI Analysis

This work addresses scalability issues in hyperparameter optimization for machine learning practitioners, enabling faster and more efficient tuning, though it is incremental as it builds on existing Bayesian Optimization methods.

The paper tackles the cubic computational complexity of Gaussian Process-based Bayesian Optimization for hyperparameter tuning by introducing a novel approximation that reduces complexity to quadratic, achieving speedups of up to 162x on a single node and further gains in parallel settings.

Most machine learning methods require careful selection of hyper-parameters in order to train a high performing model with good generalization abilities. Hence, several automatic selection algorithms have been introduced to overcome tedious manual (try and error) tuning of these parameters. Due to its very high sample efficiency, Bayesian Optimization over a Gaussian Processes modeling of the parameter space has become the method of choice. Unfortunately, this approach suffers from a cubic compute complexity due to underlying Cholesky factorization, which makes it very hard to be scaled beyond a small number of sampling steps. In this paper, we present a novel, highly accurate approximation of the underlying Gaussian Process. Reducing its computational complexity from cubic to quadratic allows an efficient strong scaling of Bayesian Optimization while outperforming the previous approach regarding optimization accuracy. The first experiments show speedups of a factor of 162 in single node and further speed up by a factor of 5 in a parallel environment.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes