LG | Apr 25, 2024

In-Context Freeze-Thaw Bayesian Optimization for Hyperparameter Optimization

arXiv:2404.16795v3 | 30 citations | h-index: 11 | ICML
Originality: Highly original
AI Analysis

This work addresses the high computational cost of hyperparameter optimization for deep learning researchers and practitioners. It offers a significant speed improvement, though the contribution is incremental in that it builds on existing freeze-thaw methods.

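The paper tackles the computational inefficiency of hyperparameter optimization in deep learning by proposing FT-PFN, a transformer-based surrogate for freeze-thaw Bayesian optimization. FT-PFN makes predictions 10-100 times faster than the deep Gaussian process and deep ensemble surrogates used in previous work and, combined with the MFPI-random acquisition mechanism, reaches new state-of-the-art performance on three families of deep learning HPO benchmarks.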

With the increasing computational costs associated with deep learning, automated hyperparameter optimization methods that rely heavily on black-box Bayesian optimization (BO) face limitations. Freeze-thaw BO offers a promising grey-box alternative, strategically allocating scarce resources incrementally to different configurations. However, the frequent surrogate model updates inherent to this approach pose challenges for existing methods, which must retrain or fine-tune their neural network surrogates online, introducing overhead, instability, and hyper-hyperparameters. In this work, we propose FT-PFN, a novel surrogate for freeze-thaw style BO. FT-PFN is a prior-data fitted network (PFN) that leverages the in-context learning ability of transformers to perform Bayesian learning curve extrapolation efficiently and reliably in a single forward pass. Our empirical analysis across three benchmark suites shows that the predictions made by FT-PFN are more accurate and 10-100 times faster than those of the deep Gaussian process and deep ensemble surrogates used in previous work. Furthermore, we show that, when combined with our novel acquisition mechanism (MFPI-random), the resulting in-context freeze-thaw BO method (ifBO) yields new state-of-the-art performance in the same three families of deep learning HPO benchmarks considered in prior work.
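
To make the freeze-thaw idea in the abstract concrete, here is a minimal Python sketch of incremental resource allocation: each step spends one epoch on whichever partially trained configuration looks most promising. It is an illustration under stated assumptions, not the paper's method. The names `toy_loss`, `extrapolate`, and `freeze_thaw` are hypothetical, the training curve is synthetic, and the naive extrapolation rule merely stands in for a learned surrogate such as FT-PFN; the MFPI-random acquisition is not reproduced here.

```python
import math


def toy_loss(lr, epoch):
    """Hypothetical learning curve: validation loss after `epoch` epochs.

    Synthetic stand-in for real training; the asymptote is lowest at lr = 1e-2.
    """
    floor = (math.log10(lr) + 2.0) ** 2
    return floor + 1.0 / epoch


def extrapolate(curve, remaining):
    """Naive stand-in surrogate (NOT FT-PFN): predict the final loss.

    Assumes each future epoch improves half as much as the previous one.
    """
    if len(curve) < 2:
        return curve[-1]
    slope = curve[-1] - curve[-2]
    # Geometric series: 0.5 + 0.25 + ... over `remaining` steps.
    return curve[-1] + slope * (1.0 - 0.5 ** remaining)


def freeze_thaw(configs, budget, max_epochs=10):
    """Spend `budget` epochs one at a time across partially trained configs."""
    curves = {c: [] for c in configs}
    for _ in range(budget):
        candidates = [c for c in configs if len(curves[c]) < max_epochs]
        if not candidates:
            break
        unseen = [c for c in candidates if not curves[c]]
        if unseen:
            # Start every configuration with a single epoch first.
            choice = unseen[0]
        else:
            # "Thaw" the config with the best predicted final loss;
            # all others stay frozen at their current epoch.
            choice = min(
                candidates,
                key=lambda c: extrapolate(curves[c], max_epochs - len(curves[c])),
            )
        curves[choice].append(toy_loss(choice, len(curves[choice]) + 1))
    return curves


if __name__ == "__main__":
    result = freeze_thaw([1e-4, 1e-3, 1e-2, 1e-1], budget=12)
    for lr, curve in result.items():
        print(lr, len(curve))
```

In this toy run, every configuration receives one epoch and the remaining budget goes to the learning rate with the lowest predicted final loss. A real freeze-thaw method would replace `extrapolate` with a probabilistic surrogate and use an acquisition function that also weighs uncertainty, which is the role FT-PFN and MFPI-random play in the paper.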

Code Implementations: 2 repos