Robust Phi-Divergence MDPs
This work addresses ambiguity in dynamic decision-making, with applications such as robotics or finance. By reducing solution times, it makes robust MDPs a more practical alternative to classical MDPs, though the contribution is incremental in that it builds on existing robust MDP frameworks.
The paper tackles the problem of solving robust Markov decision processes (MDPs) with s-rectangular ambiguity sets. It develops a novel framework that decomposes the problem into a sequence of robust Bellman updates and simplex projections. For phi-divergence ambiguity sets, exploiting the structure of these projections yields substantially faster solution times than state-of-the-art commercial solvers and a recent first-order scheme.
In recent years, robust Markov decision processes (MDPs) have emerged as a prominent modeling framework for dynamic decision problems affected by uncertainty. In contrast to classical MDPs, which only account for stochasticity by modeling the dynamics through a stochastic process with a known transition kernel, robust MDPs additionally account for ambiguity by optimizing in view of the most adverse transition kernel from a prescribed ambiguity set. In this paper, we develop a novel solution framework for robust MDPs with s-rectangular ambiguity sets that decomposes the problem into a sequence of robust Bellman updates and simplex projections. Exploiting the rich structure present in the simplex projections corresponding to phi-divergence ambiguity sets, we show that the associated s-rectangular robust MDPs can be solved substantially faster than with state-of-the-art commercial solvers as well as a recent first-order solution scheme, thus rendering them attractive alternatives to classical MDPs in practical applications.
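To give a feel for the kind of subproblem a simplex projection involves, the following minimal sketch computes the plain Euclidean projection of a vector onto the probability simplex with the standard sort-and-threshold method. It is only a generic illustration: the projections in the paper are tailored to phi-divergence ambiguity sets and are not reproduced here, and the function name `project_onto_simplex` is a hypothetical choice made for this example.

```python
def project_onto_simplex(v):
    """Euclidean projection of v onto {x : x >= 0, sum(x) = 1}.

    Generic sort-based method, shown for illustration only; it is
    not the phi-divergence-specific projection from the paper.
    """
    if not v:
        raise ValueError("input vector must be non-empty")

    # Sort entries in decreasing order and scan the running sums.
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        candidate = (cumulative - 1.0) / k
        # The last k for which u_k stays positive after shifting
        # determines the threshold theta.
        if u_k - candidate > 0:
            theta = candidate

    # Shift every entry by theta and clip negatives to zero.
    return [max(x - theta, 0.0) for x in v]


if __name__ == "__main__":
    p = project_onto_simplex([0.9, 0.4, -0.2])
    print(p, sum(p))
```

The result is always a valid probability distribution, which is why projections of this type can serve as a building block when iterates must remain transition probabilities. The cost here is dominated by the sort, so it is O(n log n) in the number of entries.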