Local Bayesian optimization via maximizing probability of descent
This work addresses efficient local optimization for practitioners whose objective functions are costly to evaluate and provide no gradient information. The contribution is incremental, in that it refines an existing scheme rather than proposing a new one.
The paper tackles local Bayesian optimization of expensive, high-dimensional black-box functions by introducing a method that moves in the direction maximizing the probability of descent, rather than in the direction of the expected gradient. Experiments on synthetic and real-world objectives show that this approach outperforms previous realizations of the same scheme and is competitive with more complex baselines.
Local optimization presents a promising approach to expensive, high-dimensional black-box optimization by sidestepping the need to globally explore the search space. For objective functions whose gradient cannot be evaluated directly, Bayesian optimization offers one solution -- we construct a probabilistic model of the objective, design a policy to learn about the gradient at the current location, and use the resulting information to navigate the objective landscape. Previous work has realized this scheme by minimizing the variance in the estimate of the gradient, then moving in the direction of the expected gradient. In this paper, we re-examine and refine this approach. We demonstrate that, surprisingly, the expected value of the gradient is not always the direction maximizing the probability of descent, and in fact, these directions may be nearly orthogonal. This observation then inspires an elegant optimization scheme seeking to maximize the probability of descent while moving in the direction of most-probable descent. Experiments on both synthetic and real-world objectives show that our method outperforms previous realizations of this optimization scheme and is competitive against other, significantly more complicated baselines.
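The gap between the expected gradient and the most-probable descent direction can be illustrated with a small numerical sketch. It assumes, as a Gaussian process model would imply, that the belief about the gradient at the current point is Gaussian, g ~ N(mu, Sigma). Under that assumption the probability that a direction v is a descent direction is Pr(v^T g < 0) = Phi(-v^T mu / sqrt(v^T Sigma v)), and this quantity is maximized by v proportional to -Sigma^{-1} mu. The function names, the 2x2 restriction, and the particular values of mu and Sigma below are illustrative choices, not taken from the paper.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def descent_probability(v, mu, sigma):
    """Pr(v^T g < 0) when g ~ N(mu, sigma); v need not be normalized."""
    n = len(v)
    mean = sum(v[i] * mu[i] for i in range(n))
    var = sum(v[i] * sigma[i][j] * v[j] for i in range(n) for j in range(n))
    return normal_cdf(-mean / math.sqrt(var))


def solve_2x2(a, b):
    """Solve a x = b for a 2x2 matrix a using Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    x0 = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    x1 = (a[0][0] * b[1] - a[1][0] * b[0]) / det
    return [x0, x1]


def most_probable_descent(mu, sigma):
    """Unit vector proportional to -sigma^{-1} mu (2-D case only)."""
    x = solve_2x2(sigma, mu)
    norm = math.hypot(*x)
    return [-xi / norm for xi in x]


# Hypothetical belief: the first gradient coordinate has a larger expected
# magnitude but is very uncertain; the second is small but well determined.
mu = [1.0, 0.1]
sigma = [[100.0, 0.0], [0.0, 0.01]]

# Direction of the negative expected gradient, normalized.
mu_norm = math.hypot(*mu)
v_mean = [-m / mu_norm for m in mu]

# Direction of most-probable descent under the Gaussian assumption.
v_mpd = most_probable_descent(mu, sigma)

p_mean = descent_probability(v_mean, mu, sigma)
p_mpd = descent_probability(v_mpd, mu, sigma)
cosine = sum(a * b for a, b in zip(v_mean, v_mpd))

print("P(descent) along negative expected gradient:", p_mean)
print("P(descent) along most-probable descent direction:", p_mpd)
print("cosine between the two directions:", cosine)
```

With these values the two directions are roughly 84 degrees apart, and the probability of descent rises from about 0.54 along the negative expected gradient to about 0.84 along the most-probable descent direction. The difference comes entirely from the anisotropic uncertainty: with an isotropic Sigma the two directions coincide.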