Transition Constrained Bayesian Optimization via Markov Decision Processes
The work addresses a limitation of Bayesian optimization in real-world problems with movement or monotonicity constraints, such as those arising in the physical sciences and machine calibration. The contribution appears incremental, however, since it builds on existing frameworks.
The paper tackles the optimization of black-box functions under transition constraints, where the search space for the next query depends on previous queries. It extends Bayesian optimization with Markov Decision Processes and uses reinforcement learning to obtain a policy that plans ahead over the entire horizon. The approach is demonstrated on applications such as chemical reactor optimization and informative path planning, though no concrete performance numbers are provided.
Bayesian optimization is a methodology to optimize black-box functions. Traditionally, it focuses on the setting where you can arbitrarily query the search space. However, many real-life problems do not offer this flexibility; in particular, the search space of the next query may depend on previous ones. Example challenges arise in the physical sciences in the form of local movement constraints, required monotonicity in certain variables, and transitions influencing the accuracy of measurements. Altogether, such transition constraints necessitate a form of planning. This work extends classical Bayesian optimization via the framework of Markov Decision Processes. We iteratively solve a tractable linearization of our utility function using reinforcement learning to obtain a policy that plans ahead for the entire horizon. This is a parallel to the optimization of an acquisition function in policy space. The resulting policy is potentially history-dependent and non-Markovian. We showcase applications in chemical reactor optimization, informative path planning, machine calibration, and other synthetic examples.
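To make the planning idea concrete, here is a minimal sketch of a single planning step under a local movement constraint. It is not the paper's method. It assumes a one-dimensional grid of candidate queries, deterministic transitions, and a fixed per-state reward vector standing in for a linearized utility, and it replaces reinforcement learning with exact backward induction. The names `allowed_moves`, `plan_path`, and `max_step` are hypothetical.

```python
import numpy as np


def allowed_moves(state, n_states, max_step=1):
    """States reachable from `state` in one step (hypothetical local movement constraint)."""
    lo = max(0, state - max_step)
    hi = min(n_states - 1, state + max_step)
    return range(lo, hi + 1)


def plan_path(reward, start, horizon, max_step=1):
    """Plan `horizon` queries by backward induction on a deterministic finite-horizon MDP.

    reward[s] is an assumed payoff for querying state s (a stand-in for a
    linearized utility). Returns the planned states and their total reward.
    """
    n = len(reward)
    value = np.zeros(n)  # value-to-go once no steps remain
    best_next = np.zeros((horizon, n), dtype=int)

    # Sweep backwards in time, recording the best admissible move from each state.
    for t in range(horizon - 1, -1, -1):
        new_value = np.empty(n)
        for s in range(n):
            moves = list(allowed_moves(s, n, max_step))
            scores = [reward[m] + value[m] for m in moves]
            k = int(np.argmax(scores))
            best_next[t, s] = moves[k]
            new_value[s] = scores[k]
        value = new_value

    # Roll the greedy policy forward from the start state.
    path, s = [], start
    for t in range(horizon):
        s = int(best_next[t, s])
        path.append(s)
    return path, float(value[start])


if __name__ == "__main__":
    path, total = plan_path([0, 0, 0, 5, 0], start=0, horizon=4)
    print(path, total)
```

With the reward concentrated at state 3 and a start at state 0, the plan walks right one cell at a time and then stays put, which a one-step greedy rule would not do because every immediate neighbor of the start has zero reward. In the setting described above, the reward would be recomputed from the current model at each iteration rather than fixed in advance.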