LGFeb 26, 2025

Gaussian Process Upper Confidence Bound Achieves Nearly-Optimal Regret in Noise-Free Gaussian Process Bandits

arXiv:2502.19006v17 citationsh-index: 7
Originality Incremental advance
AI Analysis

This resolves a gap between theory and practice for GP-UCB, impacting researchers in bandit optimization and kernel methods.

The paper tackles the theoretical suboptimality of Gaussian Process Upper Confidence Bound (GP-UCB) in noise-free Gaussian Process bandits by proving it achieves nearly-optimal regret, showing the first constant cumulative regret for specific kernels like squared exponential and Matérn.

We study the noise-free Gaussian Process (GP) bandits problem, in which the learner seeks to minimize regret through noise-free observations of the black-box objective function lying on the known reproducing kernel Hilbert space (RKHS). Gaussian process upper confidence bound (GP-UCB) is the well-known GP-bandits algorithm whose query points are adaptively chosen based on the GP-based upper confidence bound score. Although several existing works have reported the practical success of GP-UCB, the current theoretical results indicate its suboptimal performance. However, GP-UCB tends to perform well empirically compared with other nearly optimal noise-free algorithms that rely on a non-adaptive sampling scheme of query points. This paper resolves this gap between theoretical and empirical performance by showing the nearly optimal regret upper bound of noise-free GP-UCB. Specifically, our analysis shows the first constant cumulative regret in the noise-free settings for the squared exponential kernel and Matérn kernel with some degree of smoothness.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes