LG AI ML Jul 30, 2019

Wasserstein Robust Reinforcement Learning

arXiv:1907.13196v4, 92 citations
Originality Highly original
AI Analysis

This work addresses the robustness issue in reinforcement learning for real-world applications, representing an incremental improvement with a novel formulation and solver.

The paper tackles the tendency of reinforcement learning algorithms to overfit to their training environments. It proposes WR2L, a robust reinforcement learning algorithm evaluated on low- and high-dimensional control tasks, and reports empirical gains over standard and robust baselines on MuJoCo environments.

Reinforcement learning algorithms, though successful, tend to overfit to training environments, hampering their application to the real world. This paper proposes $\text{W}\text{R}^{2}\text{L}$ -- a robust reinforcement learning algorithm with significant robust performance on low and high-dimensional control tasks. Our method formalises robust reinforcement learning as a novel min-max game with a Wasserstein constraint for a correct and convergent solver. Apart from the formulation, we also propose an efficient and scalable solver following a novel zero-order optimisation method that we believe can be useful to numerical optimisation in general. We empirically demonstrate significant gains compared to standard and robust state-of-the-art algorithms on high-dimensional MuJoCo environments.
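As a rough illustration of what "zero-order optimisation" means in general, the sketch below estimates a gradient from function evaluations alone, using random Gaussian directions and symmetric finite differences. This is a generic textbook-style estimator and not the solver proposed in the paper; the function name `zero_order_gradient` and the parameters `num_samples` and `sigma` are hypothetical choices made for this example.

```python
import numpy as np


def zero_order_gradient(f, x, num_samples=2000, sigma=1e-2, seed=0):
    """Estimate the gradient of f at x using only evaluations of f.

    Generic Gaussian-smoothing estimator with antithetic (two-sided)
    samples. Illustrative only; NOT the paper's solver.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    grad = np.zeros_like(x)
    for _ in range(num_samples):
        u = rng.standard_normal(x.shape)  # random search direction
        # Symmetric finite difference of f along direction u.
        slope = (f(x + sigma * u) - f(x - sigma * u)) / (2.0 * sigma)
        grad += slope * u
    return grad / num_samples


if __name__ == "__main__":
    # Toy objective whose true gradient is 2 * x.
    point = np.array([1.0, -2.0, 0.5])
    estimate = zero_order_gradient(lambda v: float(np.sum(v ** 2)), point)
    print("estimate:", estimate)
    print("true gradient:", 2 * point)
```

Estimators of this kind are useful when the objective is only available as a black box, for example a simulator whose dynamics parameters cannot be differentiated through. The estimate is noisy, and its variance shrinks as `num_samples` grows.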

