Time-varying Gaussian Process Bandit Optimization with Non-constant Evaluation Time
This work addresses a practical limitation in applications such as recommender systems and environmental monitoring, but the contribution is incremental because it extends existing time-varying Bayesian optimization frameworks.
The paper tackles time-varying Bayesian optimization with non-constant evaluation times, a setting in which existing methods degrade, and proposes a novel algorithm with a theoretical regret bound and experimental validation.
The Gaussian process bandit is a problem in which we want to find a maximizer of a black-box function with the minimum number of function evaluations. If the black-box function varies with time, then time-varying Bayesian optimization is a promising framework. However, a drawback of current methods is the assumption that the evaluation time for every observation is constant, which can be unrealistic for many practical applications, e.g., recommender systems and environmental monitoring. As a result, the performance of current methods can degrade when this assumption is violated. To cope with this problem, we propose a novel time-varying Bayesian optimization algorithm that can effectively handle non-constant evaluation times. Furthermore, we theoretically establish a regret bound for our algorithm. Our bound elucidates that the pattern of the evaluation time sequence can strongly affect the difficulty of the problem. We also provide experimental results to validate the practical effectiveness of the proposed method.
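To make the setting concrete, the following is a minimal sketch of a time-aware GP upper-confidence-bound step that uses real-valued observation timestamps instead of evenly spaced round indices. It is not the algorithm proposed in the paper. The product kernel (a squared-exponential kernel in the input multiplied by an exponential forgetting factor in time), the function names `se_kernel`, `time_kernel`, `posterior`, and `ucb_choice`, and the parameters `lengthscale`, `eps`, `noise`, and `beta` are all illustrative assumptions.

```python
import numpy as np

def se_kernel(a, b, lengthscale=0.2):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def time_kernel(s, t, eps=0.05):
    """Assumed forgetting kernel (1 - eps)^(|s - t| / 2) on real timestamps."""
    return (1.0 - eps) ** (np.abs(s[:, None] - t[None, :]) / 2.0)

def posterior(x_obs, t_obs, y_obs, x_query, t_query, noise=1e-2):
    """GP posterior mean and std at inputs x_query, all at time t_query."""
    # Gram matrix of the product kernel plus observation noise.
    K = se_kernel(x_obs, x_obs) * time_kernel(t_obs, t_obs)
    K = K + noise * np.eye(len(x_obs))
    # Cross-covariance between query points and past observations.
    t_q = np.full(len(x_query), float(t_query))
    k_star = se_kernel(x_query, x_obs) * time_kernel(t_q, t_obs)
    mean = k_star @ np.linalg.solve(K, y_obs)
    v = np.linalg.solve(K, k_star.T)
    # Prior variance is 1 because both kernels equal 1 on the diagonal.
    var = 1.0 - np.sum(k_star * v.T, axis=1)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def ucb_choice(x_obs, t_obs, y_obs, candidates, t_now, beta=2.0):
    """Pick the candidate that maximizes mean + sqrt(beta) * std at time t_now."""
    mean, std = posterior(x_obs, t_obs, y_obs, candidates, t_now)
    return candidates[np.argmax(mean + np.sqrt(beta) * std)]
```

Under these assumptions, an observation made long before `t_now` contributes less to the posterior than a recent one, so the posterior standard deviation at a previously evaluated input grows with the elapsed time. This is one simple way in which the pattern of evaluation times, and not only the number of evaluations, can change how much is known about the current function.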