Pareto-efficient Acquisition Functions for Cost-Aware Bayesian Optimization
This work gives machine learning researchers and practitioners a more robust and efficient method for optimizing expensive black-box functions, especially when evaluation costs vary significantly.
This paper addresses cost-aware Bayesian optimization (BO), in which hyperparameter evaluations have varying costs. The authors reformulate the problem in terms of Pareto efficiency and introduce a novel Pareto-efficient adaptation of expected improvement, achieving speed-ups of up to 50% over previous solutions on 144 real-world black-box optimization problems.
Bayesian optimization (BO) is a popular method to optimize expensive black-box functions. It efficiently tunes machine learning algorithms under the implicit assumption that hyperparameter evaluations cost approximately the same. In reality, the cost of evaluating different hyperparameters, be it in terms of time, dollars or energy, can span several orders of magnitude. While a number of heuristics have been proposed to make BO cost-aware, none of these has been proven to work robustly. In this work, we reformulate cost-aware BO in terms of Pareto efficiency and introduce the cost Pareto Front, a mathematical object allowing us to highlight the shortcomings of commonly used acquisition functions. Based on this, we propose a novel Pareto-efficient adaptation of the expected improvement. On 144 real-world black-box function optimization problems we show that our Pareto-efficient acquisition functions significantly outperform previous solutions, bringing up to 50% speed-ups while providing finer control over the cost-accuracy trade-off. We also revisit the common choice of Gaussian process cost models, showing that simple, low-variance cost models predict training times effectively.
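To make the idea of Pareto efficiency between cost and expected improvement concrete, the sketch below computes the standard closed-form expected improvement (for minimization, under a Gaussian posterior) and then keeps only the candidates that are not dominated in the (lower predicted cost, higher expected improvement) sense. This is an illustrative reading, not the acquisition function proposed in the paper: the abstract does not define the cost Pareto Front precisely, and the function names `expected_improvement` and `pareto_front`, as well as the example numbers, are assumptions made here.

```python
import math


def expected_improvement(mu, sigma, best):
    """Closed-form EI for minimization under a Gaussian posterior N(mu, sigma^2).

    `best` is the lowest objective value observed so far.
    """
    if sigma <= 0.0:
        # Degenerate posterior: improvement is deterministic.
        return max(best - mu, 0.0)
    z = (best - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (best - mu) * cdf + sigma * pdf


def pareto_front(candidates):
    """Return indices of (cost, ei) pairs that no other pair dominates.

    A pair dominates another if it is no more expensive and has no lower EI,
    and is strictly better in at least one of the two.
    """
    front = []
    for i, (cost_i, ei_i) in enumerate(candidates):
        dominated = any(
            cost_j <= cost_i and ei_j >= ei_i and (cost_j < cost_i or ei_j > ei_i)
            for j, (cost_j, ei_j) in enumerate(candidates)
            if j != i
        )
        if not dominated:
            front.append(i)
    return front


# Hypothetical candidates: (predicted cost, posterior mean, posterior std).
posterior = [(1.0, 0.9, 0.1), (2.0, 0.7, 0.2), (3.0, 0.8, 0.1), (10.0, 0.5, 0.3)]
best_so_far = 0.8
scored = [(cost, expected_improvement(mu, sd, best_so_far)) for cost, mu, sd in posterior]
efficient = pareto_front(scored)  # indices of non-dominated candidates
```

A cost-aware heuristic such as dividing expected improvement by predicted cost picks a single point from this set; looking at the whole non-dominated set instead makes the cost-accuracy trade-off explicit.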