Global Optimization of Gaussian Process Acquisition Functions Using a Piecewise-Linear Kernel Approximation
This addresses the challenge of reliably optimizing acquisition functions for practitioners in Bayesian optimization, offering a provably global method that is incremental over existing sampling- or gradient-based techniques.
The paper tackles the non-convex optimization of acquisition functions in Bayesian optimization by proposing a mixed-integer programming approach with a piecewise-linear kernel approximation, achieving global convergence with theoretical regret bounds and empirical validation on synthetic and real-world tasks.
Bayesian optimization relies on iteratively constructing and optimizing an acquisition function. The latter turns out to be a challenging, non-convex optimization problem itself. Despite the relative importance of this step, most algorithms employ sampling- or gradient-based methods, which do not provably converge to global optima. This work investigates mixed-integer programming (MIP) as a paradigm for global acquisition function optimization. Specifically, our Piecewise-linear Kernel Mixed Integer Quadratic Programming (PK-MIQP) formulation introduces a piecewise-linear approximation for Gaussian process kernels and admits a corresponding MIQP representation for acquisition functions. The proposed method is applicable to uncertainty-based acquisition functions for any stationary or dot-product kernel. We analyze the theoretical regret bounds of the proposed approximation, and empirically demonstrate the framework on synthetic functions, constrained benchmarks, and a hyperparameter tuning task.