MLLGAug 20, 2021

Optimal Order Simple Regret for Gaussian Process Bandits

arXiv:2108.09262v167 citations
AI Analysis

This work provides a near-optimal solution for sequential optimization problems in fields like hyperparameter tuning or experimental design, though it is incremental as it refines existing bounds rather than introducing a new paradigm.

The paper addresses the gap between lower and upper bounds on simple regret in Gaussian Process bandits for sequential optimization of expensive functions, proving an order-optimal bound of $ ilde{\mathcal{O}}(\sqrt{\gamma_N/N})$ that is significantly tighter than existing results.

Consider the sequential optimization of a continuous, possibly non-convex, and expensive to evaluate objective function $f$. The problem can be cast as a Gaussian Process (GP) bandit where $f$ lives in a reproducing kernel Hilbert space (RKHS). The state of the art analysis of several learning algorithms shows a significant gap between the lower and upper bounds on the simple regret performance. When $N$ is the number of exploration trials and $γ_N$ is the maximal information gain, we prove an $\tilde{\mathcal{O}}(\sqrt{γ_N/N})$ bound on the simple regret performance of a pure exploration algorithm that is significantly tighter than the existing bounds. We show that this bound is order optimal up to logarithmic factors for the cases where a lower bound on regret is known. To establish these results, we prove novel and sharp confidence intervals for GP models applicable to RKHS elements which may be of broader interest.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes