An On-Line POMDP Solver for Continuous Observation Spaces
This work addresses a key bottleneck for autonomous robots, the handling of continuous observations, though the contribution is incremental in that it builds on existing POMDP and Monte-Carlo methods.
The paper tackles planning under partial observability in POMDPs with continuous observation spaces. It proposes LABECOP, an on-line solver that requires neither a discretised observation space nor a limit on the number of observations considered during planning. In experiments on three problems, LABECOP performs similarly to or better than state-of-the-art solvers.
Planning under partial observability is essential for autonomous robots. A principled way to address such planning problems is the Partially Observable Markov Decision Process (POMDP). Although solving POMDPs is computationally intractable, substantial advancements have been achieved in developing approximate POMDP solvers in the past two decades. However, computing robust solutions for problems with continuous observation spaces remains challenging. Most on-line solvers rely on discretising the observation space or artificially limiting the number of observations that are considered during planning to compute tractable policies. In this paper, we propose a new on-line POMDP solver, called Lazy Belief Extraction for Continuous POMDPs (LABECOP), that combines methods from Monte-Carlo Tree Search and particle filtering to construct a policy representation that does not require discretised observation spaces and avoids limiting the number of observations considered during planning. Experiments on three different problems involving continuous observation spaces indicate that LABECOP performs similarly to or better than state-of-the-art POMDP solvers.
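The abstract does not give LABECOP's algorithm, so the sketch below is not the authors' method. It only illustrates the general particle-filtering idea the abstract refers to: a continuous observation can be used directly by weighting each particle with the observation likelihood density, so no observation bins are needed. The 1D model, the Gaussian noise levels, and all function names are assumptions made for illustration.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution N(mean, std^2) evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def transition(state, action, rng, noise_std=0.1):
    """Hypothetical 1D dynamics: move by `action` plus Gaussian noise."""
    return state + action + rng.gauss(0.0, noise_std)


def observation_likelihood(observation, state, obs_std=0.5):
    """Hypothetical sensor model: the observation is the state plus Gaussian noise."""
    return gaussian_pdf(observation, state, obs_std)


def update_belief(particles, action, observation, rng):
    """One particle-filter step: propagate, weight by likelihood, resample."""
    propagated = [transition(s, action, rng) for s in particles]
    # The continuous observation enters only through the likelihood density,
    # so it never has to be mapped to a discrete observation bin.
    weights = [observation_likelihood(observation, s) for s in propagated]
    total = sum(weights)
    if total == 0.0:
        # Degenerate case: fall back to uniform weights.
        weights = [1.0] * len(propagated)
        total = float(len(propagated))
    normalized = [w / total for w in weights]
    resampled = rng.choices(propagated, weights=normalized, k=len(propagated))
    return resampled, normalized


rng = random.Random(0)
# Broad prior belief over a 1D state.
particles = [rng.uniform(-5.0, 5.0) for _ in range(2000)]
posterior, weights = update_belief(particles, action=1.0, observation=2.0, rng=rng)
posterior_mean = sum(posterior) / len(posterior)
```

After the update, the resampled particles concentrate around states that explain the observation well, which in this toy model is the region near the observed value.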