LG · AI · OC · PR · ML · Sep 30, 2022

Robust $Q$-learning Algorithm for Markov Decision Processes under Wasserstein Uncertainty

arXiv:2210.00898v3 · 5 citations
AI Analysis

This work addresses the challenge of robust stochastic optimal control for applications where estimated transition probabilities may be inaccurate, offering a practical solution for domains requiring reliability under uncertainty.

The authors address distributionally robust Markov decision processes under Wasserstein uncertainty by developing a novel Q-learning algorithm. They prove its convergence and, through examples that include real data, demonstrate its tractability and its benefits when the estimated distributions are misspecified.

We present a novel $Q$-learning algorithm tailored to solve distributionally robust Markov decision problems where the corresponding ambiguity set of transition probabilities for the underlying Markov decision process is a Wasserstein ball around a (possibly estimated) reference measure. We prove convergence of the presented algorithm and provide several examples, including ones using real data, to illustrate both the tractability of our algorithm and the benefits of considering distributional robustness when solving stochastic optimal control problems, in particular when the estimated distributions turn out to be misspecified in practice.
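To make the idea of a Wasserstein-robust Bellman target concrete, here is a minimal sketch for a finite state space. It is not the authors' implementation. It assumes a Wasserstein-1 ball, a precomputed cost matrix `cost[i, y]` between states, and the standard dual representation of the worst-case expectation, inf_P E_P[f] = sup_{λ≥0} { E_P̂[ min_y ( f(y) + λ c(x', y) ) ] − λ ε }, evaluated on a finite grid of multipliers. For simplicity the update uses the full reference distribution rather than a single sampled transition. The names `robust_expectation`, `robust_q_update`, `lambdas`, `discount`, and `lr` are hypothetical.

```python
import numpy as np


def robust_expectation(f, p_hat, cost, eps, lambdas):
    """Approximate the worst-case expectation of f over a Wasserstein-1 ball.

    f:       values f(y) on the finite state space, shape (n,)
    p_hat:   reference probabilities, shape (n,)
    cost:    cost[i, y] = c(state i, state y), shape (n, n)
    eps:     radius of the Wasserstein ball
    lambdas: grid of non-negative dual multipliers (assumed fine enough)
    """
    best = -np.inf
    for lam in lambdas:
        # inner[i] = min_y ( f[y] + lam * cost[i, y] )
        inner = np.min(f[None, :] + lam * cost, axis=1)
        # Dual objective for this multiplier.
        best = max(best, float(p_hat @ inner) - lam * eps)
    return best


def robust_q_update(q, x, a, reward, p_hat, cost, eps, lambdas,
                    discount=0.9, lr=0.5):
    """One robust update of Q(x, a), in place.

    reward[x, a, y] is the reward for moving from x to y under action a;
    p_hat[x, a] is the reference next-state distribution.
    """
    # f(y) = r(x, a, y) + discount * max_b Q(y, b)
    f = reward[x, a] + discount * q.max(axis=1)
    target = robust_expectation(f, p_hat[x, a], cost, eps, lambdas)
    q[x, a] += lr * (target - q[x, a])
    return q
```

With `eps = 0` and a sufficiently large multiplier in the grid, the target reduces to the ordinary (non-robust) expectation under the reference measure. Increasing `eps` lowers the target, which is how the robustness enters the update.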
