AIOct 7, 2020

Bayesian Optimized Monte Carlo Planning

arXiv:2010.03597v133 citations
Originality Incremental advance
AI Analysis

This addresses a scalability problem for researchers and practitioners in AI planning and decision-making under uncertainty, though it is an incremental improvement over existing Monte Carlo tree search methods.

The paper tackles the challenge of scaling online solvers for partially observable Markov decision processes (POMDPs) with large action spaces by introducing Bayesian Optimized Monte Carlo Planning (BOMCP), which uses Bayesian optimization for efficient action sampling, resulting in improved scalability compared to existing state-of-the-art tree search solvers.

Online solvers for partially observable Markov decision processes have difficulty scaling to problems with large action spaces. Monte Carlo tree search with progressive widening attempts to improve scaling by sampling from the action space to construct a policy search tree. The performance of progressive widening search is dependent upon the action sampling policy, often requiring problem-specific samplers. In this work, we present a general method for efficient action sampling based on Bayesian optimization. The proposed method uses a Gaussian process to model a belief over the action-value function and selects the action that will maximize the expected improvement in the optimal action value. We implement the proposed approach in a new online tree search algorithm called Bayesian Optimized Monte Carlo Planning (BOMCP). Several experiments show that BOMCP is better able to scale to large action space POMDPs than existing state-of-the-art tree search solvers.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes