LG AI Mar 29, 2021

Robust Reinforcement Learning under model misspecification

arXiv:2103.15370v1, 2 citations
Originality: Incremental advance
AI Analysis

This paper addresses robustness issues for RL applications in real-world control, but it appears incremental because it builds on existing methods such as POMDPs and adversarial training.

The paper tackles model misspecification in reinforcement learning, where an agent faces different transition dynamics during training and deployment. It proposes a framework that combines history trajectories and POMDP modeling with an adversarial attack method for robust training, and reports experiments in four gym domains that validate its effectiveness.

Reinforcement learning has achieved remarkable performance in a wide range of tasks. Nevertheless, some unsolved problems limit its application to real-world control. One of them is model misspecification, a situation in which an agent is trained and deployed in environments with different transition dynamics. We propose a novel framework that uses the history trajectory and Partially Observable Markov Decision Process (POMDP) modeling to deal with this problem. Additionally, we put forward an efficient adversarial attack method to assist robust training. Our experiments in four gym domains validate the effectiveness of our framework.
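
To make the setting in the abstract concrete, here is a minimal sketch of model misspecification and of a history-augmented observation. It is not the authors' framework or their adversarial attack: the toy `PointMass` environment, its hidden `mass` parameter, the `HistoryWrapper` class, the window size `k`, and the hand-written `pd_policy` are all hypothetical names chosen for illustration. The sketch assumes only that the deployment dynamics differ from the training dynamics through a parameter the agent cannot observe directly.

```python
from collections import deque


class PointMass:
    """Toy 1-D environment (hypothetical). `mass` is a hidden dynamics parameter."""

    def __init__(self, mass=1.0, dt=0.1, horizon=50):
        self.mass, self.dt, self.horizon = mass, dt, horizon

    def reset(self):
        self.pos, self.vel, self.t = 1.0, 0.0, 0
        return (self.pos, self.vel)

    def step(self, force):
        # The transition depends on `mass`, which is not part of the observation.
        self.vel += (force / self.mass) * self.dt
        self.pos += self.vel * self.dt
        self.t += 1
        reward = -abs(self.pos)  # stay close to the origin
        done = self.t >= self.horizon
        return (self.pos, self.vel), reward, done


class HistoryWrapper:
    """Pairs each observation with the last k (observation, action) transitions.

    The hidden parameter makes the task partially observable; a recent history
    window is one common way to let a policy infer it (an assumption here).
    """

    def __init__(self, env, k=4):
        self.env = env
        self.history = deque(maxlen=k)

    def reset(self):
        self.history.clear()
        self._last_obs = self.env.reset()
        return self._last_obs, tuple(self.history)

    def step(self, action):
        self.history.append((self._last_obs, action))
        obs, reward, done = self.env.step(action)
        self._last_obs = obs
        return (obs, tuple(self.history)), reward, done


def pd_policy(obs, history):
    """Fixed controller tuned for mass=1.0; it ignores the history on purpose."""
    pos, vel = obs
    return -4.0 * pos - 2.0 * vel


def rollout(env, policy):
    """Run one episode and return the undiscounted sum of rewards."""
    obs, hist = env.reset()
    total, done = 0.0, False
    while not done:
        (obs, hist), reward, done = env.step(policy(obs, hist))
        total += reward
    return total


# Same policy, different transition dynamics at "training" and "deployment".
train_return = rollout(HistoryWrapper(PointMass(mass=1.0)), pd_policy)
deploy_return = rollout(HistoryWrapper(PointMass(mass=5.0)), pd_policy)
print(train_return, deploy_return)
```

The controller that works well for the training mass performs worse when the mass changes, which is the gap a robust method aims to close. A learned policy could instead consume the `history` tuple to adapt to the unseen dynamics; how the paper does this is not specified in the abstract.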

Code Implementations: 1 repo
