AI LG | Aug 16, 2024

Pessimistic Iterative Planning with RNNs for Robust POMDPs

Maris F. L. Galesloot, Marnix Suilen, Thiago D. Simão, Steven Carr, Matthijs T. J. Spaan, Ufuk Topcu, Nils Jansen

arXiv:2408.08770v4 | 9.66 citations | h-index: 53

Originality: Incremental advance

AI Analysis

This work addresses robust decision-making under model uncertainty in partially observable environments, with applications such as robotics and autonomous systems. It offers an incremental improvement over existing methods.

The paper tackles the problem of computing memory-based policies for robust POMDPs that stay robust under model uncertainty. It proposes the pessimistic iterative planning (PIP) framework and, within it, the rFSCNet algorithm, which optimizes a recurrent neural network to compute finite-state controllers. In the empirical evaluation, rFSCNet computes better-performing robust policies than several baselines and a state-of-the-art robust POMDP solver.

Robust POMDPs extend classical POMDPs to incorporate model uncertainty using so-called uncertainty sets on the transition and observation functions, effectively defining ranges of probabilities. Policies for robust POMDPs must be (1) memory-based to account for partial observability and (2) robust against model uncertainty to account for the worst-case probability instances from the uncertainty sets. To compute such robust memory-based policies, we propose the pessimistic iterative planning (PIP) framework, which alternates between (1) selecting pessimistic POMDPs via worst-case probability instances from the uncertainty sets, and (2) computing finite-state controllers (FSCs) for these pessimistic POMDPs. Within PIP, we propose the rFSCNet algorithm, which optimizes a recurrent neural network to compute the FSCs. The empirical evaluation shows that rFSCNet can compute better-performing robust policies than several baselines and a state-of-the-art robust POMDP solver.
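To make the idea of a "worst-case probability instance" concrete, the sketch below handles one simple kind of uncertainty set: a per-successor probability interval. It is not taken from the paper and is not the rFSCNet implementation. It uses the standard greedy solution for interval uncertainty, which gives as much probability as the bounds allow to the lowest-valued successors first. The function name `worst_case_distribution`, the interval representation, and the assumption that higher values are better for the agent are all illustrative choices.

```python
from typing import List, Sequence


def worst_case_distribution(
    lower: Sequence[float],
    upper: Sequence[float],
    values: Sequence[float],
) -> List[float]:
    """Return a distribution inside [lower, upper] that minimizes expected value.

    Hypothetical helper: assumes interval uncertainty and that the agent
    maximizes value, so the adversary wants the lowest expected value.
    """
    n = len(values)
    if not (len(lower) == len(upper) == n):
        raise ValueError("lower, upper and values must have equal length")
    if sum(lower) > 1.0 + 1e-9 or sum(upper) < 1.0 - 1e-9:
        raise ValueError("intervals do not admit a valid distribution")

    # Start from the lower bounds, then hand out the remaining mass.
    probs = list(lower)
    budget = 1.0 - sum(lower)

    # Visit successors from worst (lowest value) to best.
    for i in sorted(range(n), key=lambda j: values[j]):
        extra = min(upper[i] - lower[i], budget)
        probs[i] += extra
        budget -= extra
    return probs


# Example with three successor states (numbers are made up).
lower = [0.1, 0.2, 0.1]
upper = [0.6, 0.7, 0.5]
values = [10.0, 0.0, 5.0]
pessimistic = worst_case_distribution(lower, upper, values)
expected = sum(p * v for p, v in zip(pessimistic, values))
```

In this example the lowest-valued successor receives its full upper bound, so the result is approximately `[0.1, 0.7, 0.2]`. A pessimistic POMDP in the sense of the abstract would fix one such instance for the transition and observation functions. How PIP actually selects those instances is described in the paper and is not reproduced here.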
