Benefits of Monotonicity in Safe Exploration with Gaussian Processes
This work addresses safe exploration for applications such as adaptive clinical trials and robotics. Its contribution is incremental: it builds on the existing GP-UCB and SafeOpt methods by adding a monotonicity assumption.
The paper tackles the problem of safely maximizing an unknown function subject to a safety threshold, under the assumption that the function is monotone with respect to a safety variable. It proposes the M-SafeUCB algorithm, which comes with theoretical guarantees on safety and regret, and argues that the monotonicity assumption improves algorithmic efficiency. The empirical evaluations include a clinical trial simulation.
We consider the problem of sequentially maximising an unknown function over a set of actions while ensuring that every sampled point has a function value below a given safety threshold. We model the function using kernel-based and Gaussian process methods, while differing from previous works in our assumption that the function is monotonically increasing with respect to a \emph{safety variable}. This assumption is motivated by various practical applications such as adaptive clinical trial design and robotics. Taking inspiration from the \textsc{\sffamily GP-UCB} and \textsc{\sffamily SafeOpt} algorithms, we propose an algorithm, monotone safe {\sffamily UCB} (\textsc{\sffamily M-SafeUCB}), for this task. We show that \textsc{\sffamily M-SafeUCB} enjoys theoretical guarantees in terms of safety, a suitably-defined regret notion, and approximately finding the entire safe boundary. In addition, we illustrate that the monotonicity assumption yields significant benefits in terms of the guarantees obtained, as well as algorithmic simplicity and efficiency. We support our theoretical findings by performing empirical evaluations on a variety of functions, including a simulated clinical trial experiment.
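To give some intuition for why monotonicity in a safety variable can simplify safe exploration, the following is a minimal one-dimensional sketch. It is not M-SafeUCB: the Gaussian process confidence bounds are replaced by a deterministic upper bound derived from an assumed Lipschitz constant, observations are assumed noise-free, and only the safety variable is searched. The function name `safe_frontier_search` and the parameters `h`, `lipschitz`, `s0`, `s_max`, and `grid_size` are hypothetical and chosen for illustration only.

```python
def safe_frontier_search(f, h, lipschitz, s0=0.0, s_max=1.0, grid_size=101):
    """Expand the certified-safe interval of a monotonically increasing f.

    Illustrative assumptions (not taken from the paper):
      * f(s0) <= h, i.e. a known safe seed point;
      * evaluations of f are noise-free;
      * `lipschitz` upper-bounds the slope of f, standing in for the
        upper confidence bound a GP model would provide.
    Returns the list of (point, value) pairs that were sampled.
    """
    grid = [s0 + (s_max - s0) * i / (grid_size - 1) for i in range(grid_size)]
    samples = [(s0, f(s0))]
    while True:
        s_last, y_last = samples[-1]
        # Upper bound on f at s >= s_last is y_last + L * (s - s_last).
        # Because f is increasing and samples move to the right, the most
        # recent sample always gives the tightest such bound.
        certified = [
            s for s in grid
            if s > s_last and y_last + lipschitz * (s - s_last) <= h
        ]
        if not certified:
            break  # no further grid point can be certified safe
        # Monotonicity implies every point below a certified-safe point is
        # also safe, so the safe set is an interval and only its right
        # endpoint needs to be sampled.
        s_next = max(certified)
        samples.append((s_next, f(s_next)))
    return samples


if __name__ == "__main__":
    trace = safe_frontier_search(lambda s: s ** 2, h=0.5, lipschitz=2.0)
    print("sampled points:", [round(s, 2) for s, _ in trace])
```

In this simplified setting the safe set is always an interval starting at the seed point, so each step reduces to choosing the largest point whose upper bound stays below the threshold. Without monotonicity, a method in the style of SafeOpt would have to track a general safe set and its potential expanders.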