LG · ML · May 3, 2021

How Bayesian Should Bayesian Optimisation Be?

arXiv:2105.00894v1 · 8 citations
Originality: Incremental advance
AI Analysis

This work addresses a methodological bottleneck in Bayesian optimization for researchers and practitioners, offering an incremental improvement over standard maximum likelihood approaches.

The paper tackles the problem of overconfident predictions in Bayesian optimization (BO) by proposing a fully-Bayesian treatment of Gaussian process hyperparameters (FBBO), finding that FBBO with Expected Improvement and an ARD kernel performs best in noise-free settings across 15 benchmark problems.

Bayesian optimisation (BO) uses probabilistic surrogate models, usually Gaussian processes (GPs), for the optimisation of expensive black-box functions. At each BO iteration, the GP hyperparameters are fit to previously-evaluated data by maximising the marginal likelihood. However, this fails to account for uncertainty in the hyperparameters themselves, leading to overconfident model predictions. This uncertainty can be accounted for by taking the Bayesian approach of marginalising out the model hyperparameters. We investigate whether a fully-Bayesian treatment of the Gaussian process hyperparameters in BO (FBBO) leads to improved optimisation performance. Since an analytic approach is intractable, we compare FBBO using three approximate inference schemes to the maximum likelihood approach, using the Expected Improvement (EI) and Upper Confidence Bound (UCB) acquisition functions paired with ARD and isotropic Matérn kernels, across 15 well-known benchmark problems under 4 observational noise settings. FBBO using EI with an ARD kernel leads to the best performance in the noise-free setting, with much less difference between combinations of BO components when the noise is increased. FBBO leads to over-exploration with UCB, but is not detrimental with EI. Therefore, we recommend FBBO using EI with an ARD kernel as the default choice for BO.
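The contrast the abstract draws, between a point estimate of the GP hyperparameters and marginalising them out, can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes a zero-mean GP with unit signal variance, a single isotropic Matérn 5/2 lengthscale (not ARD), a standard normal prior on the log lengthscale, and a random-walk Metropolis sampler standing in for the paper's approximate inference schemes. The "maximum likelihood" fit is a simple grid search. All function names and the toy data are hypothetical.

```python
import math
import numpy as np

def matern52(X1, X2, lengthscale):
    """Isotropic Matern 5/2 kernel (unit variance) between rows of X1 and X2."""
    d = np.sqrt(((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)) / lengthscale
    s = math.sqrt(5.0) * d
    return (1.0 + s + s ** 2 / 3.0) * np.exp(-s)

def gp_posterior(X, y, Xs, lengthscale, jitter=1e-6):
    """Zero-mean GP: posterior mean and std at Xs, plus log marginal likelihood."""
    K = matern52(X, X, lengthscale) + jitter * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    Ks = matern52(X, Xs, lengthscale)
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.clip(1.0 - (v ** 2).sum(0), 1e-12, None)
    lml = (-0.5 * y @ alpha - np.log(np.diag(L)).sum()
           - 0.5 * len(X) * math.log(2.0 * math.pi))
    return mu, np.sqrt(var), lml

def expected_improvement(mu, sigma, best):
    """EI for minimisation, given the best (lowest) value observed so far."""
    z = (best - mu) / sigma
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    return (best - mu) * cdf + sigma * pdf

def sample_lengthscales(X, y, n_samples=200, step=0.3, seed=0):
    """Random-walk Metropolis on log(lengthscale); assumed N(0, 1) prior on the log."""
    rng = np.random.default_rng(seed)

    def log_post(log_ell):
        return gp_posterior(X, y, X[:1], math.exp(log_ell))[2] - 0.5 * log_ell ** 2

    cur, cur_lp, samples = 0.0, log_post(0.0), []
    for _ in range(n_samples):
        prop = cur + step * rng.normal()
        prop_lp = log_post(prop)
        if rng.uniform() < math.exp(min(0.0, prop_lp - cur_lp)):
            cur, cur_lp = prop, prop_lp
        samples.append(math.exp(cur))
    return np.array(samples)

def averaged_ei(X, y, Xs, lengthscales):
    """EI averaged over hyperparameter values (a single value gives plain EI)."""
    best = y.min()
    eis = []
    for ell in lengthscales:
        mu, sigma, _ = gp_posterior(X, y, Xs, ell)
        eis.append(expected_improvement(mu, sigma, best))
    return np.mean(eis, axis=0)

# Toy 1-D problem (hypothetical data, for illustration only).
X = np.array([[0.05], [0.3], [0.5], [0.7], [0.95]])
y = np.sin(6.0 * X[:, 0])
Xs = np.linspace(0.0, 1.0, 51)[:, None]

# Point estimate: lengthscale maximising the marginal likelihood on a grid.
grid = np.exp(np.linspace(-3.0, 2.0, 60))
ell_ml = grid[np.argmax([gp_posterior(X, y, X[:1], ell)[2] for ell in grid])]
ei_ml = averaged_ei(X, y, Xs, [ell_ml])

# Fully-Bayesian variant: average EI over posterior samples, dropping burn-in.
ei_fb = averaged_ei(X, y, Xs, sample_lengthscales(X, y)[50:])
```

Comparing `ei_ml` with `ei_fb` over `Xs` shows how averaging over plausible lengthscales changes where the acquisition function places its mass. Whether that helps optimisation is the empirical question the paper studies.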

Code Implementations: 1 repo
