Sequential Bayesian Optimisation as a POMDP for Environment Monitoring with UAVs
This work addresses practical constraints in field robotics, such as those arising in UAV environment monitoring, by integrating sequential decision-making into Bayesian Optimisation (BO). The contribution is incremental, since it builds on existing BO and POMDP methods.
The paper tackles the problem of applying Bayesian Optimisation to robotic systems with physical and trajectory constraints by formulating it as a Partially Observable Markov Decision Process (POMDP) and solving it with Monte-Carlo Tree Search (MCTS). In experiments monitoring a spatial phenomenon with a UAV, the authors' BO-POMDP algorithm outperformed competing techniques.
Bayesian Optimisation has gained much popularity lately, as a global optimisation technique for functions that are expensive to evaluate or unknown a priori. While classical BO focuses on where to gather an observation next, it does not take into account practical constraints for a robotic system such as where it is physically possible to gather samples from, nor the sequential nature of the problem while executing a trajectory. In field robotics and other real-life situations, physical and trajectory constraints are inherent problems. This paper addresses these issues by formulating Bayesian Optimisation for continuous trajectories within a Partially Observable Markov Decision Process (POMDP) framework. The resulting POMDP is solved using Monte-Carlo Tree Search (MCTS), which we adapt to using a reward function balancing exploration and exploitation. Experiments on monitoring a spatial phenomenon with a UAV illustrate how our BO-POMDP algorithm outperforms competing techniques.
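To make the idea of "a reward function balancing exploration and exploitation" concrete, the following is a minimal sketch and not the paper's implementation. It assumes a zero-mean Gaussian process with a unit-variance squared-exponential kernel as the belief over the monitored phenomenon, and a UCB-style reward (posterior mean plus kappa times posterior standard deviation) summed over the waypoints of a candidate trajectory. The function names, the kernel, the noise level, and the exact reward form are all assumptions chosen for illustration. The tree search itself is omitted: an MCTS planner would call a function like trajectory_reward when scoring simulated action sequences.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Unit-variance squared-exponential kernel between a (n, d) and b (m, d)."""
    sq_dists = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dists / length_scale ** 2)


def gp_posterior(x_train, y_train, x_query, noise=1e-3):
    """Posterior mean and standard deviation of a zero-mean GP at x_query."""
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_qt = rbf_kernel(x_query, x_train)
    mean = k_qt @ np.linalg.solve(k_tt, y_train)
    # Prior variance is 1.0 for this kernel; subtract the part explained by data.
    explained = np.sum(k_qt * np.linalg.solve(k_tt, k_qt.T).T, axis=1)
    std = np.sqrt(np.clip(1.0 - explained, 0.0, None))
    return mean, std


def trajectory_reward(x_train, y_train, waypoints, kappa=2.0):
    """Hypothetical UCB-style reward: sum of (mean + kappa * std) over waypoints.

    kappa = 0 rewards only high predicted values (exploitation); a larger
    kappa increasingly rewards visiting uncertain regions (exploration).
    """
    mean, std = gp_posterior(x_train, y_train, waypoints)
    return float(np.sum(mean + kappa * std))


# Two past observations of the phenomenon (illustrative values only).
x_train = np.array([[0.0, 0.0], [1.0, 0.0]])
y_train = np.array([0.0, 1.0])

# Two candidate trajectories, each given as a short sequence of waypoints.
near = np.array([[0.5, 0.0], [1.0, 0.0]])  # stays close to known high values
far = np.array([[5.0, 5.0], [6.0, 6.0]])   # heads into unobserved territory

for kappa in (0.0, 5.0):
    r_near = trajectory_reward(x_train, y_train, near, kappa)
    r_far = trajectory_reward(x_train, y_train, far, kappa)
    best = "near" if r_near > r_far else "far"
    print(f"kappa={kappa}: near={r_near:.2f}, far={r_far:.2f}, best={best}")
```

With kappa set to zero the trajectory near the known high reading scores higher, while a large kappa favours the trajectory through unobserved space. This is the trade-off that the abstract attributes to the reward function, shown here under the stated assumptions only.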