LO AIJun 30, 2017

Tableaux for Policy Synthesis for MDPs with PCTL* Constraints

Peter Baumgartner, Sylvie Thiébaux, Felipe Trevizan

arXiv:1706.10102v31.22 citations

Originality Highly original

AI Analysis

This work addresses a gap in policy synthesis for MDPs with expressive PCTL* specifications, which is incremental as it builds on existing frameworks by introducing a new algorithm for a previously unsolved problem.

The paper tackles the policy synthesis problem for Markov decision processes (MDPs) with PCTL* constraints, which was previously undecidable, by restricting to finitely bounded memory policies and develops a tableau-based algorithm that generates a system of equalities and inequalities to derive stochastic policies, with the main result proving the correctness of the method in terms of soundness, completeness, and termination.

Markov decision processes (MDPs) are the standard formalism for modelling sequential decision making in stochastic environments. Policy synthesis addresses the problem of how to control or limit the decisions an agent makes so that a given specification is met. In this paper we consider PCTL*, the probabilistic counterpart of CTL*, as the specification language. Because in general the policy synthesis problem for PCTL* is undecidable, we restrict to policies whose execution history memory is finitely bounded a priori. Surprisingly, no algorithm for policy synthesis for this natural and expressive framework has been developed so far. We close this gap and describe a tableau-based algorithm that, given an MDP and a PCTL* specification, derives in a non-deterministic way a system of (possibly nonlinear) equalities and inequalities. The solutions of this system, if any, describe the desired (stochastic) policies. Our main result in this paper is the correctness of our method, i.e., soundness, completeness and termination.

View on arXiv PDF

Similar