Efficient batch-sequential Bayesian optimization with moments of truncated Gaussian vectors
This work addresses the efficiency bottleneck in parallel Bayesian optimization for researchers and practitioners in machine learning and optimization.
The paper tackles the computational challenge of parallelizing Bayesian optimization algorithms by developing a closed-form formula for a generalized multipoint expected improvement criterion, which enables gradient-based batch optimization. The proposed approaches achieve computational savings of 1-2 orders of magnitude, making derivative-based batch-sequential acquisition function maximization practically implementable.
We deal with the efficient parallelization of Bayesian global optimization algorithms, and more specifically of those based on the expected improvement criterion and its variants. A closed form formula relying on multivariate Gaussian cumulative distribution functions is established for a generalized version of the multipoint expected improvement criterion. In turn, the latter relies on intermediate results that could be of independent interest concerning moments of truncated Gaussian vectors. The obtained expansion of the criterion enables studying its differentiability with respect to point batches and calculating the corresponding gradient in closed form. Furthermore , we derive fast numerical approximations of this gradient and propose efficient batch optimization strategies. Numerical experiments illustrate that the proposed approaches enable computational savings of between one and two order of magnitudes, hence enabling derivative-based batch-sequential acquisition function maximization to become a practically implementable and efficient standard.