LGFeb 26, 2025

Gaussian Process Upper Confidence Bound Achieves Nearly-Optimal Regret in Noise-Free Gaussian Process Bandits

arXiv:2502.19006v115.77 citationsh-index: 7

Originality Incremental advance

AI Analysis

This resolves a gap between theory and practice for GP-UCB, impacting researchers in bandit optimization and kernel methods.

The paper tackles the theoretical suboptimality of Gaussian Process Upper Confidence Bound (GP-UCB) in noise-free Gaussian Process bandits by proving it achieves nearly-optimal regret, showing the first constant cumulative regret for specific kernels like squared exponential and Matérn.

We study the noise-free Gaussian Process (GP) bandits problem, in which the learner seeks to minimize regret through noise-free observations of the black-box objective function lying on the known reproducing kernel Hilbert space (RKHS). Gaussian process upper confidence bound (GP-UCB) is the well-known GP-bandits algorithm whose query points are adaptively chosen based on the GP-based upper confidence bound score. Although several existing works have reported the practical success of GP-UCB, the current theoretical results indicate its suboptimal performance. However, GP-UCB tends to perform well empirically compared with other nearly optimal noise-free algorithms that rely on a non-adaptive sampling scheme of query points. This paper resolves this gap between theoretical and empirical performance by showing the nearly optimal regret upper bound of noise-free GP-UCB. Specifically, our analysis shows the first constant cumulative regret in the noise-free settings for the squared exponential kernel and Matérn kernel with some degree of smoothness.

View on arXiv PDF

Similar