LG · ML · May 27, 2023

PFNs4BO: In-Context Learning for Bayesian Optimization

arXiv:2305.17535v5 · 79 citations · Has Code
Originality: Incremental advance
AI Analysis

This work addresses the need for more adaptable and efficient Bayesian Optimization (BO) methods in machine learning, particularly for hyperparameter optimization. It is an incremental advance that builds on existing PFN and BO techniques.

The paper makes Bayesian Optimization (BO) more flexible by using Prior-data Fitted Networks (PFNs) as the surrogate model. Because a PFN learns in context from samples of a prior, the same approach supports different priors as well as extensions such as user priors and non-myopic BO. The authors demonstrate its usefulness in large-scale evaluations on the hyperparameter optimization testbeds HPO-B, Bayesmark, and PD1.

In this paper, we use Prior-data Fitted Networks (PFNs) as a flexible surrogate for Bayesian Optimization (BO). PFNs are neural processes that are trained to approximate the posterior predictive distribution (PPD) through in-context learning on any prior distribution that can be efficiently sampled from. We describe how this flexibility can be exploited for surrogate modeling in BO. We use PFNs to mimic a naive Gaussian process (GP), an advanced GP, and a Bayesian Neural Network (BNN). In addition, we show how to incorporate further information into the prior, such as allowing hints about the position of optima (user priors), ignoring irrelevant dimensions, and performing non-myopic BO by learning the acquisition function. The flexibility underlying these extensions opens up vast possibilities for using PFNs for BO. We demonstrate the usefulness of PFNs for BO in a large-scale evaluation on artificial GP samples and three different hyperparameter optimization testbeds: HPO-B, Bayesmark, and PD1. We publish code alongside trained models at github.com/automl/PFNs4BO.
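To make the role of an in-context surrogate in a BO loop concrete, here is a minimal Python sketch. It is not the authors' implementation and does not use the PFNs4BO code. The function `toy_in_context_surrogate` is a hypothetical stand-in for a trained PFN (a simple kernel smoother), chosen only because it shares the same calling pattern: it receives the observed points as context at prediction time and has no separate fitting step. The Gaussian form of expected improvement, the 1-D search space, and `toy_objective` are likewise assumptions made for illustration.

```python
import math
import random


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mean, std, best):
    """Expected improvement for maximization, assuming a Gaussian prediction."""
    if std <= 0.0:
        return max(mean - best, 0.0)
    z = (mean - best) / std
    return (mean - best) * normal_cdf(z) + std * normal_pdf(z)


def toy_in_context_surrogate(context_x, context_y, query_x, length_scale=0.1):
    """Hypothetical stand-in for a trained PFN (NOT a PFN).

    Takes the observations as context and returns (mean, std) at query_x.
    The std shrinks near observed points and grows away from them.
    """
    weights = [
        math.exp(-0.5 * ((query_x - x) / length_scale) ** 2) for x in context_x
    ]
    total = sum(weights)
    fallback = sum(context_y) / len(context_y)
    # Small regularizer: far from all data, fall back to the context mean.
    mean = (sum(w * y for w, y in zip(weights, context_y)) + 1e-3 * fallback) / (
        total + 1e-3
    )
    std = 1.0 / math.sqrt(1.0 + total)
    return mean, std


def bayes_opt(objective, n_init=3, n_iter=15, n_candidates=200, seed=0):
    """Maximize `objective` on [0, 1) with a surrogate queried in context."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n_init)]
    ys = [objective(x) for x in xs]
    for _ in range(n_iter):
        best = max(ys)
        candidates = [rng.random() for _ in range(n_candidates)]

        def score(x):
            # No refitting: the full history is passed as context each time.
            mean, std = toy_in_context_surrogate(xs, ys, x)
            return expected_improvement(mean, std, best)

        x_next = max(candidates, key=score)
        xs.append(x_next)
        ys.append(objective(x_next))
    i = max(range(len(ys)), key=ys.__getitem__)
    return xs[i], ys[i]


def toy_objective(x):
    # Illustrative objective with its maximum at x = 0.3.
    return -((x - 0.3) ** 2)


best_x, best_y = bayes_opt(toy_objective)
print(f"best x = {best_x:.3f}, best y = {best_y:.5f}")
```

In this sketch, swapping the stand-in for a real trained model would only require a function with the same `(context_x, context_y, query_x)` signature. A predictive distribution that is not Gaussian would also need a matching way to compute the acquisition value.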

Code Implementations: 1 repo
