Sampling Acquisition Functions for Batch Bayesian Optimization
This work addresses the challenge of scaling Bayesian optimization to parallel settings, a problem relevant to machine learning researchers and practitioners, although it is an incremental improvement over existing batch methods.
The paper tackles batch Bayesian optimization by introducing Acquisition Thompson Sampling (ATS), a method that samples acquisition functions from a stochastic process defined through the model hyper-parameters. It shows that ATS outperforms classical parallel Thompson Sampling and is competitive with state-of-the-art batch methods on benchmark and hyper-parameter optimization tasks.
We present Acquisition Thompson Sampling (ATS), a novel technique for batch Bayesian Optimization (BO) based on the idea of sampling multiple acquisition functions from a stochastic process. We define this process through the dependency of the acquisition functions on a set of model hyper-parameters. ATS is conceptually simple and straightforward to implement and, unlike other batch BO methods, it can be employed to parallelize any sequential acquisition function or to make existing parallel methods scale further. We present experiments on a variety of benchmark functions and on the hyper-parameter optimization of a popular gradient boosting tree algorithm. These demonstrate the advantages of ATS over classical parallel Thompson Sampling for BO, its competitiveness with two state-of-the-art batch BO methods, and its effectiveness when applied to existing parallel BO algorithms.
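The core idea, one acquisition function per hyper-parameter sample with each maximizer contributing one batch point, can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not say which acquisition function, which hyper-parameters, or which sampling scheme is used, so everything below is an assumption made for illustration: a one-dimensional Gaussian process with a squared-exponential kernel, the length-scale as the only sampled hyper-parameter, Expected Improvement (for minimization) as the sequential acquisition function, a log-uniform prior with importance resampling as a crude stand-in for sampling from the hyper-parameter posterior, and a fixed candidate grid in place of a continuous optimizer. All function names are hypothetical.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def rbf_kernel(a, b, lengthscale):
    # Squared-exponential kernel on 1-D inputs (unit signal variance, an assumption).
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def _cholesky_and_alpha(x_train, y_train, lengthscale, noise):
    K = rbf_kernel(x_train, x_train, lengthscale) + noise * np.eye(len(x_train))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    return L, alpha

def log_marginal_likelihood(x_train, y_train, lengthscale, noise=1e-3):
    # Standard GP log evidence for a fixed length-scale.
    L, alpha = _cholesky_and_alpha(x_train, y_train, lengthscale, noise)
    n = len(x_train)
    return (-0.5 * y_train @ alpha
            - np.sum(np.log(np.diag(L)))
            - 0.5 * n * math.log(2.0 * math.pi))

def gp_posterior(x_train, y_train, x_cand, lengthscale, noise=1e-3):
    # Posterior mean and standard deviation of the latent function at x_cand.
    L, alpha = _cholesky_and_alpha(x_train, y_train, lengthscale, noise)
    Ks = rbf_kernel(x_train, x_cand, lengthscale)
    mean = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.clip(1.0 - np.sum(v ** 2, axis=0), 1e-12, None)
    return mean, np.sqrt(var)

def expected_improvement(mean, std, best):
    # Expected Improvement for minimization.
    z = (best - mean) / std
    cdf = 0.5 * (1.0 + _erf(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return (best - mean) * cdf + std * pdf

def ats_batch(x_train, y_train, x_cand, batch_size, rng, n_prior=200):
    # Assumed sampling scheme: draw length-scales from a log-uniform prior,
    # then resample them with weights proportional to the marginal likelihood.
    prior = np.exp(rng.uniform(math.log(0.05), math.log(2.0), size=n_prior))
    logw = np.array([log_marginal_likelihood(x_train, y_train, l) for l in prior])
    w = np.exp(logw - logw.max())
    w /= w.sum()
    lengthscales = rng.choice(prior, size=batch_size, p=w)

    best = y_train.min()
    batch = []
    for l in lengthscales:
        # Each hyper-parameter sample defines its own acquisition function.
        mean, std = gp_posterior(x_train, y_train, x_cand, l)
        ei = expected_improvement(mean, std, best)
        batch.append(x_cand[np.argmax(ei)])
    return np.array(batch), lengthscales

# Hypothetical usage on a toy objective.
rng = np.random.default_rng(0)
x_train = np.array([0.1, 0.4, 0.6, 0.9])
y_train = np.sin(6.0 * x_train)
x_cand = np.linspace(0.0, 1.0, 101)
batch, lengthscales = ats_batch(x_train, y_train, x_cand, batch_size=4, rng=rng)
print(batch, lengthscales)
```

In this sketch nothing forces the batch points to be distinct: two length-scale samples that induce similar acquisition surfaces will propose the same candidate. Swapping `expected_improvement` for another sequential acquisition function leaves the rest of the loop unchanged, which is the sense in which the approach is agnostic to the acquisition function.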