AI · Nov 28, 2022

Shielding in Resource-Constrained Goal POMDPs

arXiv:2211.15349v1 · 4 citations · h-index: 2 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses resource management in autonomous systems and offers a safety-oriented solution for domains such as robotics. It is an incremental advance because it builds on existing POMDP planning methods.

The paper tackles the problem of preventing resource exhaustion in partially observable Markov decision processes (POMDPs). It develops a shield-based algorithm that minimizes the expected cost of reaching a goal while ensuring the resource never runs out, and its experiments demonstrate applicability to benchmarks from the literature.

We consider partially observable Markov decision processes (POMDPs) modeling an agent that needs a supply of a certain resource (e.g., electricity stored in batteries) to operate correctly. The resource is consumed by the agent's actions and can be replenished only in certain states. The agent aims to minimize the expected cost of reaching some goal while preventing resource exhaustion, a problem we call resource-constrained goal optimization (RSGO). We take a two-step approach to the RSGO problem. First, using formal methods techniques, we design an algorithm computing a shield for a given scenario: a procedure that observes the agent and prevents it from using actions that might eventually lead to resource exhaustion. Second, we augment the POMCP heuristic search algorithm for POMDP planning with our shields to obtain an algorithm solving the RSGO problem. We implement our algorithm and present experiments showing its applicability to benchmarks from the literature.
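To make the idea of a shield concrete, here is a minimal Python sketch. It is not the paper's algorithm: it assumes a fully observable toy model (the shield knows the current state, whereas the paper handles partial observability), integer action costs, and a resource that is reset to full capacity on entering a reload state. All names (`min_safe_levels`, `shield`, the states `A`, `B`, `R`, `trap`) are hypothetical. The sketch first computes, for every state, the least resource level from which exhaustion can be avoided forever, and then uses those levels to filter actions.

```python
import math
from typing import Dict, List, Set, Tuple

State = str
Action = str
Succ = Dict[Tuple[State, Action], Set[State]]  # possible successors per (state, action)


def min_safe_levels(states: List[State], actions: List[Action], succ: Succ,
                    cost: Dict[Action, int], reloads: Set[State],
                    capacity: int) -> Dict[State, float]:
    """Least resource level per state that allows avoiding exhaustion forever.

    Fixed-point iteration from an optimistic guess of 0; levels only grow.
    A level above the capacity is unattainable and is recorded as infinity.
    """
    level: Dict[State, float] = {s: 0 for s in states}
    changed = True
    while changed:
        changed = False
        for s in states:
            best = math.inf
            for a in actions:
                nxt = succ.get((s, a))
                if not nxt:
                    continue  # action not enabled in this state
                # Worst case over successors; a safe reload state needs nothing
                # on arrival because the resource is refilled there.
                need = max(
                    0 if (t in reloads and level[t] != math.inf) else level[t]
                    for t in nxt
                )
                best = min(best, cost[a] + need)
            if best > capacity:
                best = math.inf
            if best != level[s]:
                level[s] = best
                changed = True
    return level


def shield(state: State, resource: int, actions: List[Action], succ: Succ,
           cost: Dict[Action, int], reloads: Set[State],
           level: Dict[State, float]) -> List[Action]:
    """Return the actions that cannot lead to resource exhaustion."""
    allowed = []
    for a in actions:
        nxt = succ.get((state, a))
        if not nxt:
            continue
        left = resource - cost[a]
        if left < 0:
            continue  # not enough resource to perform the action at all
        if all((level[t] != math.inf) if t in reloads else (left >= level[t])
               for t in nxt):
            allowed.append(a)
    return allowed


# Toy scenario (assumed for illustration): R is the only reload state and
# "trap" is a region with no way back to a reload state.
STATES = ["A", "B", "R", "trap"]
ACTIONS = ["fwd", "back", "risky"]
COST = {"fwd": 1, "back": 1, "risky": 1}
RELOADS = {"R"}
CAPACITY = 4
SUCC: Succ = {
    ("A", "fwd"): {"B"},
    ("A", "risky"): {"B", "trap"},  # may slip into the trap
    ("B", "fwd"): {"R"},
    ("B", "back"): {"A"},
    ("R", "fwd"): {"A"},
    ("trap", "fwd"): {"trap"},
}

levels = min_safe_levels(STATES, ACTIONS, SUCC, COST, RELOADS, CAPACITY)
print(levels)  # {'A': 2, 'B': 1, 'R': 3, 'trap': inf}
print(shield("A", 4, ACTIONS, SUCC, COST, RELOADS, levels))  # ['fwd']
print(shield("B", 1, ACTIONS, SUCC, COST, RELOADS, levels))  # ['fwd']
```

In this toy run the shield blocks `risky` in state `A` because one of its outcomes is the trap, and it blocks `back` in state `B` when only one unit of resource is left. In a planner along the lines the abstract describes, a filter of this kind would restrict the actions the search is allowed to consider; how the paper wires shields into POMCP is not shown here.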

Code Implementations: 1 repo