LO, AI · Oct 24, 2019

Simple Strategies in Multi-Objective MDPs (Technical Report)

arXiv:1910.11024v3 · 30 citations
Originality: Incremental advance
AI Analysis

This work addresses multi-objective decision-making in MDPs, a problem relevant to both researchers and practitioners. It is an incremental advance because it builds on existing verification methods.

The paper tackles the verification of multiple expected reward objectives in Markov decision processes, which enables trade-off analysis via Pareto fronts. It focuses on pure stationary and bounded-memory strategies. It shows that checking achievability for pure stationary strategies is NP-complete, and its experimental results demonstrate that the proposed algorithms are feasible.

We consider the verification of multiple expected reward objectives at once on Markov decision processes (MDPs). This enables a trade-off analysis among multiple objectives by obtaining the Pareto front. We focus on strategies that are easy to employ and implement, that is, strategies that are pure (no randomization) and have bounded memory. We show that checking whether a point is achievable by a pure stationary strategy is NP-complete, even for two objectives, and we provide an MILP encoding to solve the corresponding problem. The bounded memory case can be reduced to the stationary one by a product construction. Experimental results using Storm and Gurobi show the feasibility of our algorithms.
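To make the achievability question concrete, the following is a minimal sketch that checks whether a point is achievable by a pure stationary strategy on a tiny MDP. It is not the paper's MILP encoding: it simply enumerates every pure stationary strategy, which takes exponential time in general and is only practical for toy models. The MDP, the data layout, and all function names are hypothetical and chosen for illustration. The sketch also assumes total expected rewards, an absorbing reward-free target state that is reached with probability 1 under every strategy, and "achievable" meaning that each objective's value is at least the corresponding coordinate of the point.

```python
from itertools import product

# Hypothetical toy MDP with two reward objectives.
# MDP[state][action] = list of (probability, next_state, reward_vector).
# State 2 is an absorbing, reward-free target (an assumption of this sketch).
MDP = {
    0: {
        "a": [(1.0, 1, (1.0, 0.0))],
        "b": [(0.5, 1, (0.0, 2.0)), (0.5, 2, (0.0, 2.0))],
    },
    1: {
        "c": [(1.0, 2, (2.0, 0.0))],
        "d": [(1.0, 2, (0.0, 1.0))],
    },
    2: {"stay": [(1.0, 2, (0.0, 0.0))]},
}


def evaluate(mdp, strategy, init, num_obj, sweeps=100):
    """Approximate the expected total reward vector from `init`.

    Uses plain iterative policy evaluation; this converges only under the
    assumption that the target is reached with probability 1.
    """
    values = {s: [0.0] * num_obj for s in mdp}
    for _ in range(sweeps):
        new_values = {}
        for s in mdp:
            vec = [0.0] * num_obj
            for prob, succ, reward in mdp[s][strategy[s]]:
                for i in range(num_obj):
                    vec[i] += prob * (reward[i] + values[succ][i])
            new_values[s] = vec
        values = new_values
    return tuple(values[init])


def pure_stationary_strategies(mdp):
    """Yield every mapping from states to one fixed action."""
    states = sorted(mdp)
    for choice in product(*(sorted(mdp[s]) for s in states)):
        yield dict(zip(states, choice))


def achievable(mdp, init, point, eps=1e-9):
    """Return a pure stationary strategy whose value dominates `point`, or None."""
    for strategy in pure_stationary_strategies(mdp):
        value = evaluate(mdp, strategy, init, len(point))
        if all(v >= p - eps for v, p in zip(value, point)):
            return strategy
    return None


if __name__ == "__main__":
    print(achievable(MDP, 0, (1.0, 2.0)))
    print(achievable(MDP, 0, (2.0, 1.0)))
```

In this toy model the four pure stationary strategies yield the value vectors (3, 0), (1, 1), (1, 2), and (0, 2.5) from state 0. The point (2, 1) is dominated by none of them, although a strategy that randomizes evenly between actions a and b in state 0 would reach it, which illustrates why restricting to pure strategies changes the set of achievable points.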

