OC AI LG PR
Jul 5, 2024

Q-Learning under Finite Model Uncertainty

arXiv:2407.04259v35.61 citations
h-index: 2
Has Code

Originality: Incremental advance

AI Analysis

This work addresses robust reinforcement learning for scenarios with finite ambiguity sets, offering a flexible approach beyond common formulations like KL and Wasserstein balls, though it appears incremental in extending existing robust methods.

The authors study robust Q-learning in Markov decision processes under finite model uncertainty. They propose an algorithm that converges to the robust optimum and give non-asymptotic error bounds that separate stochastic approximation error from transition-kernel estimation error.

We propose a robust Q-learning algorithm for Markov decision processes under model uncertainty when each state-action pair is associated with a finite ambiguity set of candidate transition kernels. This finite-measure framework enables highly flexible, user-designed uncertainty models and goes beyond the common KL and Wasserstein ball formulations. We establish almost sure convergence of the learned Q-function to the robust optimum, and derive non-asymptotic high-probability error bounds that separate stochastic approximation error from transition-kernel estimation error. Finally, we show that Wasserstein ball and parametric ambiguity sets can be approximated by finite ambiguity sets, allowing our algorithm to be used as a generic solver beyond the finite setting.
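To make the finite-ambiguity-set idea concrete, here is a minimal tabular sketch of a robust Q-update. It is not the authors' algorithm: the abstract does not spell out the update rule, and this sketch assumes the candidate kernels are known exactly, so there is no kernel estimation step and no sampling noise in the target. Under that assumption the update reduces to an asynchronous, relaxed form of robust value iteration, where the next-state value is replaced by its worst-case expectation over the candidates for the visited state-action pair. The function name `robust_q_learning`, the array layout, the uniform sampling of state-action pairs, the constant step size, and the example numbers are all hypothetical choices made for illustration.

```python
import numpy as np

def robust_q_learning(kernels, rewards, gamma=0.9, alpha=0.5,
                      n_steps=20000, seed=0):
    """Tabular robust Q-update over a finite ambiguity set (illustrative sketch).

    kernels: array of shape (K, S, A, S); kernels[k, s, a] is the k-th
             candidate next-state distribution for the pair (s, a).
    rewards: array of shape (S, A).
    Assumes the candidate kernels are known exactly (no estimation step).
    """
    rng = np.random.default_rng(seed)
    K, S, A, _ = kernels.shape
    Q = np.zeros((S, A))
    for _ in range(n_steps):
        # Visit a state-action pair uniformly at random (assumed exploration scheme).
        s = rng.integers(S)
        a = rng.integers(A)
        # Greedy value of every possible next state under the current Q.
        v = Q.max(axis=1)
        # Worst-case expected next-state value over the K candidate kernels.
        worst = (kernels[:, s, a, :] @ v).min()
        # Relaxed update toward the robust Bellman target.
        Q[s, a] += alpha * (rewards[s, a] + gamma * worst - Q[s, a])
    return Q

# Hypothetical example: 2 states, 2 actions, 2 candidate kernels per pair.
kernels = np.array([
    [[[0.9, 0.1], [0.2, 0.8]],
     [[0.5, 0.5], [0.1, 0.9]]],
    [[[0.7, 0.3], [0.4, 0.6]],
     [[0.6, 0.4], [0.3, 0.7]]],
])
rewards = np.array([[1.0, 0.0], [0.0, 2.0]])

Q = robust_q_learning(kernels, rewards)
print(Q)
```

Because the minimum runs over a finite list of kernels, the worst case is a plain array reduction rather than an inner optimization problem, which is what makes user-designed ambiguity sets easy to plug in. A sample-based variant would have to replace the exact expectations with estimates, which is where the transition-kernel estimation error mentioned in the abstract would enter.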
