Efficient Algorithms for Robust Markov Decision Processes with $s$-Rectangular Ambiguity Sets
This work addresses the need for faster computational methods for robust MDPs, which are crucial for decision-making under uncertainty in fields such as operations research and AI. It represents a strong, specific gain rather than a foundational breakthrough.
The paper tackles the problem of solving robust Markov decision processes with $s$-rectangular ambiguity sets. It develops efficient algorithms that achieve speedups of several orders of magnitude over state-of-the-art commercial solvers and are often only a logarithmic factor slower than solving classical MDPs.
Robust Markov decision processes (MDPs) have attracted significant interest due to their ability to protect MDPs from poor out-of-sample performance in the presence of ambiguity. In contrast to classical MDPs, which account for stochasticity by modeling the dynamics through a stochastic process with a known transition kernel, a robust MDP additionally accounts for ambiguity by optimizing against the most adverse transition kernel from an ambiguity set constructed via historical data. In this paper, we develop a unified solution framework for a broad class of robust MDPs with $s$-rectangular ambiguity sets, where the most adverse transition probabilities are considered independently for each state. Using our algorithms, we show that $s$-rectangular robust MDPs with $1$- and $2$-norm as well as $\phi$-divergence ambiguity sets can be solved several orders of magnitude faster than with state-of-the-art commercial solvers, and often only a logarithmic factor slower than classical MDPs. We demonstrate the favorable scaling properties of our algorithms on a range of synthetically generated as well as standard benchmark instances.
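To make the $s$-rectangular setting concrete, the robust Bellman operator can be sketched as follows. The notation is an assumption introduced for illustration and is not fixed by the abstract: $\mathcal{S}$ and $\mathcal{A}$ denote the state and action sets, $\Delta_{\mathcal{A}}$ and $\Delta_{\mathcal{S}}$ the probability simplices over them, $r_{s,a}$ a reward vector indexed by successor states, $\gamma \in (0,1)$ a discount factor, $\bar{p}_{s,a}$ a nominal transition distribution estimated from data, and $\kappa_s \ge 0$ a per-state ambiguity budget.

\[
(\mathfrak{T} v)(s) \;=\; \max_{\pi_s \in \Delta_{\mathcal{A}}} \; \min_{p_s \in \mathcal{P}_s} \; \sum_{a \in \mathcal{A}} \pi_s(a) \, p_{s,a}^\top \left( r_{s,a} + \gamma v \right),
\qquad
\mathcal{P} \;=\; \times_{s \in \mathcal{S}} \mathcal{P}_s .
\]

Under these assumptions, a $1$-norm ambiguity set for state $s$ could take the form

\[
\mathcal{P}_s \;=\; \Big\{ (p_{s,a})_{a \in \mathcal{A}} \;:\; p_{s,a} \in \Delta_{\mathcal{S}} \ \text{for all } a \in \mathcal{A}, \ \ \sum_{a \in \mathcal{A}} \left\lVert p_{s,a} - \bar{p}_{s,a} \right\rVert_1 \le \kappa_s \Big\} .
\]

The outer maximization ranges over randomized action choices because optimal policies in $s$-rectangular robust MDPs may need to randomize. Setting $\kappa_s = 0$ collapses $\mathcal{P}_s$ to the nominal kernel and recovers the classical Bellman operator.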