LG · ML · Aug 14, 2024

Differentiating Policies for Non-Myopic Bayesian Optimization

arXiv:2408.07812v1 · 1 citation · h-index: 2
Originality: Incremental advance
AI Analysis

This work addresses a bottleneck in Bayesian optimization for researchers and practitioners by making non-myopic policies more practical, though it is incremental as it builds on existing rollout methods.

The paper tackles the computational challenge of optimizing non-myopic acquisition functions in Bayesian optimization by efficiently estimating rollout acquisition functions and their gradients, enabling stochastic gradient-based optimization of sampling policies.

Bayesian optimization (BO) methods choose sample points by optimizing an acquisition function derived from a statistical model of the objective. These acquisition functions are chosen to balance sampling regions with predicted good objective values against exploring regions where the objective is uncertain. Standard acquisition functions are myopic, considering only the impact of the next sample, but non-myopic acquisition functions may be more effective. In principle, one could model the sampling by a Markov decision process, and optimally choose the next sample by maximizing an expected reward computed by dynamic programming; however, this is infeasibly expensive. More practical approaches, such as rollout, consider a parametric family of sampling policies. In this paper, we show how to efficiently estimate rollout acquisition functions and their gradients, enabling stochastic gradient-based optimization of sampling policies.
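To make the idea of an acquisition function and its stochastic gradient concrete, here is a minimal sketch in plain Python. It is not the paper's rollout estimator: it uses the standard myopic expected improvement (EI) for minimization under a Gaussian posterior, and compares the closed form with a Monte Carlo estimate whose gradient is obtained by reparameterizing the sample as `mu + sigma * eps`. The function names, the sample count, and the seed are illustrative assumptions, not anything taken from the paper.

```python
import math
import random


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu, sigma, f_best):
    """Closed-form EI (minimization) for a posterior N(mu, sigma^2).

    The first term rewards a low predicted mean (exploitation);
    the second rewards high predictive uncertainty (exploration).
    """
    z = (f_best - mu) / sigma
    return (f_best - mu) * normal_cdf(z) + sigma * normal_pdf(z)


def ei_grad_mu_exact(mu, sigma, f_best):
    """Exact derivative of EI with respect to the posterior mean: -Phi(z)."""
    return -normal_cdf((f_best - mu) / sigma)


def ei_and_grad_mc(mu, sigma, f_best, n_samples=20000, seed=0):
    """Monte Carlo EI and its pathwise gradient with respect to mu.

    Each sample is written as y = mu + sigma * eps with eps ~ N(0, 1),
    so max(f_best - y, 0) can be differentiated sample by sample.
    """
    rng = random.Random(seed)
    value_sum = 0.0
    grad_sum = 0.0
    for _ in range(n_samples):
        eps = rng.gauss(0.0, 1.0)
        y = mu + sigma * eps          # reparameterized posterior sample
        improvement = f_best - y
        if improvement > 0.0:
            value_sum += improvement
            grad_sum += -1.0          # d/dmu of (f_best - mu - sigma * eps)
    return value_sum / n_samples, grad_sum / n_samples


if __name__ == "__main__":
    exact = expected_improvement(0.2, 1.0, 0.0)
    mc_value, mc_grad = ei_and_grad_mc(0.2, 1.0, 0.0)
    print("EI exact:", exact, "EI Monte Carlo:", mc_value)
    print("grad exact:", ei_grad_mu_exact(0.2, 1.0, 0.0), "grad Monte Carlo:", mc_grad)
```

In a non-myopic setting no closed form is generally available, so sample-based estimates of this kind are what a stochastic gradient optimizer would consume. The one-step case is used here only because the exact answer is known and the estimator can be checked against it.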
