LGFeb 11, 2021

No-Regret Algorithms for Time-Varying Bayesian Optimization

arXiv:2102.06296v211.930 citations

Originality Incremental advance

AI Analysis

This work addresses the challenge of optimizing time-varying functions for applications like adaptive control or resource allocation, representing an incremental advance by extending regret analysis to a frequentist framework.

The paper tackles the time-varying Bayesian optimization problem by proposing two algorithms, R-GP-UCB and SW-GP-UCB, which achieve the first frequentist regret guarantees for dynamic regret in this setting, recovering previous linear bandit results and complementing Bayesian analyses.

In this paper, we consider the time-varying Bayesian optimization problem. The unknown function at each time is assumed to lie in an RKHS (reproducing kernel Hilbert space) with a bounded norm. We adopt the general variation budget model to capture the time-varying environment, and the variation is characterized by the change of the RKHS norm. We adapt the restart and sliding window mechanism to introduce two GP-UCB type algorithms: R-GP-UCB and SW-GP-UCB, respectively. We derive the first (frequentist) regret guarantee on the dynamic regret for both algorithms. Our results not only recover previous linear bandit results when a linear kernel is used, but complement the previous regret analysis of time-varying Gaussian process bandit under a Bayesian-type regularity assumption, i.e., each function is a sample from a Gaussian process.

View on arXiv PDF

Similar