AI · Dec 23, 2022

Online Planning for Constrained POMDPs with Continuous Spaces through Dual Ascent

arXiv:2212.12154v1 · 10 citations · h-index: 23
Originality: Incremental advance
AI Analysis

This work addresses safety-critical planning in robotics and autonomous systems, where hard constraints must be enforced. It is a domain-specific, incremental advance.

The paper tackles online planning for Constrained POMDPs with continuous spaces by combining dual ascent with progressive widening, achieving effective solutions for safety-critical problems and outperforming reward-scalarization methods.

Rather than augmenting rewards with penalties for undesired behavior, Constrained Partially Observable Markov Decision Processes (CPOMDPs) plan safely by imposing inviolable hard constraint value budgets. Previous work performing online planning for CPOMDPs has only been applied to discrete action and observation spaces. In this work, we propose algorithms for online CPOMDP planning for continuous state, action, and observation spaces by combining dual ascent with progressive widening. We empirically compare the effectiveness of our proposed algorithms on continuous CPOMDPs that model both toy and real-world safety-critical problems. Additionally, we compare against the use of online solvers for continuous unconstrained POMDPs that scalarize cost constraints into rewards, and investigate the effect of optimistic cost propagation.
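The abstract names two ingredients, dual ascent and progressive widening, without giving their update rules. The following is a minimal Python sketch of the generic textbook forms of both, not the authors' implementation. All function names, the parameters `k`, `alpha`, and `step_size`, and the two-action toy problem are assumptions made for illustration.

```python
import math


def should_widen(num_children: int, num_visits: int,
                 k: float = 1.0, alpha: float = 0.5) -> bool:
    """Progressive widening test (generic form, assumed here).

    A node may sample a new child from a continuous space only while
    its child count satisfies |children| <= k * visits**alpha.
    """
    return num_children <= k * num_visits ** alpha


def lagrangian_value(q_reward: float, q_cost: float, lam: float) -> float:
    """Scalar objective used for action selection: reward minus priced cost."""
    return q_reward - lam * q_cost


def dual_ascent_step(lam: float, cost_estimate: float, budget: float,
                     step_size: float, lam_max: float = math.inf) -> float:
    """Projected subgradient step on the dual variable.

    lam grows when the estimated cost exceeds the budget and shrinks
    otherwise; it is clipped to the interval [0, lam_max].
    """
    return min(lam_max, max(0.0, lam + step_size * (cost_estimate - budget)))


def run_dual_ascent(actions: dict, budget: float,
                    steps: int, step_size: float) -> float:
    """Toy loop: `actions` maps a name to a known (reward, cost) pair.

    In a real planner these values would come from tree search; here
    they are fixed numbers so the dual update can be followed by hand.
    """
    lam = 0.0
    for _ in range(steps):
        # Pick the action that is best under the current cost price.
        best = max(actions, key=lambda a: lagrangian_value(*actions[a], lam))
        _, cost = actions[best]
        lam = dual_ascent_step(lam, cost, budget, step_size)
    return lam


if __name__ == "__main__":
    # Hypothetical two-action problem: "fast" earns more but costs more.
    toy = {"fast": (10.0, 2.0), "slow": (6.0, 0.5)}
    print(run_dual_ascent(toy, budget=1.0, steps=50, step_size=0.5))
```

In this toy problem the multiplier climbs until the costly action stops being preferred, then oscillates around the price at which the two actions tie. That oscillation reflects the fact that a constrained problem may need to mix between actions to meet its budget exactly.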

Code Implementations: 1 repo
