LG · ML · May 17, 2019

Stochastically Dominant Distributional Reinforcement Learning

arXiv:1905.07318v4 · 26 citations
Originality: Highly original
AI Analysis

The paper addresses uncertainty management in reinforcement learning for applications that require robust decision-making, and it proposes a novel method for a known bottleneck.

The paper tackles the problem of managing aleatoric uncertainty in reinforcement learning by proposing a distributional method based on second-order stochastic dominance (SSD) to compare random returns, resulting in a more robust policy that better balances uncertainty and performance compared to other risk measures.

We describe a new approach for managing aleatoric uncertainty in the Reinforcement Learning (RL) paradigm. Instead of selecting actions according to a single statistic, we propose a distributional method based on the second-order stochastic dominance (SSD) relation. This compares the inherent dispersion of random returns induced by actions, producing a more comprehensive and robust evaluation of the environment's uncertainty. The necessary conditions for SSD require estimators to predict accurate second moments. To accommodate this, we map the distributional RL problem to a Wasserstein gradient flow, treating the distributional Bellman residual as a potential energy functional. We propose a particle-based algorithm for which we prove optimality and convergence. Our experiments characterize the algorithm's performance and demonstrate how uncertainty and performance are better balanced using an SSD policy than with other risk measures.
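
To make the SSD relation mentioned in the abstract concrete, the following is a minimal sketch, not the paper's algorithm. It assumes each action's return distribution is represented by equally weighted particles, and it uses the standard textbook characterization that X dominates Y in the second order when E[(t - X)^+] <= E[(t - Y)^+] for every threshold t. The function names `expected_shortfall_below` and `ssd_dominates` are hypothetical and chosen only for illustration.

```python
import numpy as np


def expected_shortfall_below(samples, thresholds):
    """Return E[(t - X)^+] for each threshold t.

    Assumption: X is uniform over the given particles (equal weights).
    """
    samples = np.asarray(samples, dtype=float)
    thresholds = np.asarray(thresholds, dtype=float)
    # Broadcast to a (num_thresholds, num_samples) matrix of shortfalls.
    shortfalls = np.maximum(thresholds[:, None] - samples[None, :], 0.0)
    return shortfalls.mean(axis=1)


def ssd_dominates(x, y, tol=1e-12):
    """Check whether the empirical distribution of x SSD-dominates that of y.

    For discrete distributions the difference of the two shortfall curves
    is piecewise linear with kinks only at the support points, so checking
    the union of the particles is sufficient.
    """
    grid = np.union1d(x, y)
    return bool(
        np.all(
            expected_shortfall_below(x, grid)
            <= expected_shortfall_below(y, grid) + tol
        )
    )


# Two hypothetical actions with the same mean return but different spread.
safe_action = [1.0, 1.0, 1.0]
risky_action = [0.0, 1.0, 2.0]

print(ssd_dominates(safe_action, risky_action))
print(ssd_dominates(risky_action, safe_action))
```

In this toy case both actions have the same expected return, so a mean-based policy cannot separate them, while the SSD check prefers the less dispersed one. Since SSD is only a partial order, some pairs of actions are incomparable under it, and the sketch does not say how to choose between them.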
