LG · ML · Aug 22, 2019

Opponent Aware Reinforcement Learning

arXiv:1908.08773v2 · 4 citations
AI Analysis

This work addresses the challenge of accounting for opponents in reinforcement learning. The contribution is incremental, since it extends the existing MDP framework.

The paper tackles reinforcement learning in adversarial environments by introducing Threatened Markov Decision Processes (TMDPs) and a level-k thinking scheme. Its experiments report empirical benefits for an agent that accounts for adversaries while it learns.

We introduce Threatened Markov Decision Processes (TMDPs) as an extension of the classical Markov Decision Process framework for Reinforcement Learning (RL). TMDPs allow supporting a decision maker against potential opponents in an RL context. We also propose a level-k thinking scheme resulting in a novel learning approach to deal with TMDPs. After introducing our framework and deriving theoretical results, relevant empirical evidence is given via extensive experiments, showing the benefits of accounting for adversaries in RL while the agent learns.
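To make the idea of "accounting for adversaries while the agent learns" concrete, here is a minimal sketch of an opponent-aware tabular Q-learner. It is not the authors' implementation. It assumes one plausible reading of the abstract: the agent keeps values indexed by state, its own action, and the opponent's action, estimates the opponent's behaviour from observed frequencies (a non-strategic opponent model, which would correspond to the lowest rung of a level-k hierarchy), and acts on the expected value under that belief. The class name `OpponentAwareQLearner`, its methods, and all hyperparameters are hypothetical.

```python
import random
from collections import defaultdict


class OpponentAwareQLearner:
    """Illustrative tabular learner with values indexed by (state, action, opponent action).

    Assumption: the opponent is modelled by per-state action frequencies.
    This is a sketch of the general idea, not the algorithm from the paper.
    """

    def __init__(self, actions, opp_actions, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.opp_actions = list(opp_actions)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = defaultdict(float)              # (s, a, b) -> estimated value
        self.counts = defaultdict(lambda: 1.0)   # (s, b) -> pseudo-count, uniform prior
        self.rng = random.Random(seed)

    def opponent_belief(self, s):
        """Estimated probability of each opponent action in state s."""
        total = sum(self.counts[(s, b)] for b in self.opp_actions)
        return {b: self.counts[(s, b)] / total for b in self.opp_actions}

    def expected_q(self, s, a):
        """Value of action a in state s, averaged over the opponent belief."""
        belief = self.opponent_belief(s)
        return sum(p * self.q[(s, a, b)] for b, p in belief.items())

    def act(self, s):
        """Epsilon-greedy choice on the belief-averaged values."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.expected_q(s, a))

    def update(self, s, a, b, r, s_next):
        """Record the opponent's move b, then apply a Q-learning style update."""
        self.counts[(s, b)] += 1.0
        best_next = max(self.expected_q(s_next, a2) for a2 in self.actions)
        target = r + self.gamma * best_next
        self.q[(s, a, b)] += self.alpha * (target - self.q[(s, a, b)])
```

A typical loop would call `act(s)`, observe the opponent's action `b`, the reward `r`, and the next state, then call `update(s, a, b, r, s_next)`. Replacing the frequency model in `opponent_belief` with the predictions of a simulated lower-level learner is one way such a sketch could be extended toward level-k reasoning, again as an assumption rather than a description of the paper's method.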

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
