OCAIPRSep 17, 2013

Models and algorithms for skip-free Markov decision processes on trees

arXiv:1309.4291v22.32 citations
Originality Incremental advance
AI Analysis

This work provides a novel algorithmic approach for solving skip-free Markov decision processes, which could benefit researchers and practitioners in operations research and control theory, though it appears incremental as it builds on existing methods.

The authors tackled the problem of solving multidimensional control problems by introducing skip-free Markov decision processes on trees and developed an algorithm that combines the advantages of value and policy iteration, guaranteeing convergence to optimal policies and value functions in a finite number of iterations with computational effort comparable to value iteration.

We introduce a class of models for multidimensional control problems which we call skip-free Markov decision processes on trees. We describe and analyse an algorithm applicable to Markov decision processes of this type that are skip-free in the negative direction. Starting with the finite average cost case, we show that the algorithm combines the advantages of both value iteration and policy iteration -- it is guaranteed to converge to an optimal policy and optimal value function after a finite number of iterations but the computational effort required for each iteration step is comparable with that for value iteration. We show that the algorithm can also be used to solve discounted cost models and continuous time models, and that a suitably modified algorithm can be used to solve communicating models.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes