AIAug 22, 2013

POMDPs under Probabilistic Semantics

arXiv:1308.4846v116 citations

Originality Incremental advance

AI Analysis

This addresses theoretical challenges in controller synthesis for POMDPs with path constraints, which is incremental as it extends prior work on decision problems in this domain.

The paper tackles the problem of computing almost-sure winning sets in partially observable Markov decision processes (POMDPs) with limit-average payoff under probabilistic semantics, showing that for qualitative constraints, finite-memory controller existence is EXPTIME-complete and infinite-memory controller existence is undecidable, while for quantitative constraints, finite-memory controller existence is undecidable.

We consider partially observable Markov decision processes (POMDPs) with limit-average payoff, where a reward value in the interval [0,1] is associated to every transition, and the payoff of an infinite path is the long-run average of the rewards. We consider two types of path constraints: (i) quantitative constraint defines the set of paths where the payoff is at least a given threshold λ in (0, 1]; and (ii) qualitative constraint which is a special case of quantitative constraint with λ = 1. We consider the computation of the almost-sure winning set, where the controller needs to ensure that the path constraint is satisfied with probability 1. Our main results for qualitative path constraint are as follows: (i) the problem of deciding the existence of a finite-memory controller is EXPTIME-complete; and (ii) the problem of deciding the existence of an infinite-memory controller is undecidable. For quantitative path constraint we show that the problem of deciding the existence of a finite-memory controller is undecidable.

View on arXiv PDF

Similar