LODSSYSYJul 21, 2014

Symblicit algorithms for optimal strategy synthesis in monotonic Markov decision processes

arXiv:1407.53964 citationsh-index: 46Has Code
Originality Incremental advance
AI Analysis

This work addresses the scalability problem of strategy synthesis in large MDPs for researchers in automated planning and formal verification, offering an alternative to existing symblicit methods.

The authors propose symblicit algorithms using pseudo-antichains for optimal strategy synthesis in monotonic Markov decision processes, achieving promising experimental results in run time and memory consumption for automated planning and LTL synthesis applications.

When treating Markov decision processes (MDPs) with large state spaces, using explicit representations quickly becomes unfeasible. Lately, Wimmer et al. have proposed a so-called symblicit algorithm for the synthesis of optimal strategies in MDPs, in the quantitative setting of expected mean-payoff. This algorithm, based on the strategy iteration algorithm of Howard and Veinott, efficiently combines symbolic and explicit data structures, and uses binary decision diagrams as symbolic representation. The aim of this paper is to show that the new data structure of pseudo-antichains (an extension of antichains) provides another interesting alternative, especially for the class of monotonic MDPs. We design efficient pseudo-antichain based symblicit algorithms (with open source implementations) for two quantitative settings: the expected mean-payoff and the stochastic shortest path. For two practical applications coming from automated planning and LTL synthesis, we report promising experimental results w.r.t. both the run time and the memory consumption.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes