LG | AI | CR | ML | Sep 5, 2018

Reinforcement Learning under Threats

arXiv:1809.01560v2 | 30 citations
AI Analysis

This paper addresses security issues in RL for decision-makers, but it appears incremental because it builds on existing adversarial frameworks.

The paper tackles the problem of adversaries interfering with reward generation in reinforcement learning by introducing Threatened Markov Decision Processes (TMDPs) and a level-k thinking scheme, showing benefits through extensive experiments.

In several reinforcement learning (RL) scenarios, mainly in security settings, there may be adversaries trying to interfere with the reward generating process. In this paper, we introduce Threatened Markov Decision Processes (TMDPs), which provide a framework to support a decision maker against a potential adversary in RL. Furthermore, we propose a level-$k$ thinking scheme resulting in a new learning framework to deal with TMDPs. After introducing our framework and deriving theoretical results, relevant empirical evidence is given via extensive experiments, showing the benefits of accounting for adversaries while the agent learns.
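To make the idea of accounting for an adversary while learning concrete, here is a minimal sketch, not the authors' implementation. It assumes a tabular setting in which the agent observes the adversary's action after each step, keeps a Q-table indexed by (state, own action, adversary action), and models the adversary with per-state empirical frequencies. That corresponds roughly to a level-1 agent reasoning about a non-strategic (level-0) adversary. The class name `OpponentAwareQLearner`, its methods, and all hyperparameters are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


class OpponentAwareQLearner:
    """Tabular Q-learner over (state, own action, adversary action).

    Illustrative assumption: the adversary is modelled by smoothed
    empirical action frequencies observed in each state.
    """

    def __init__(self, n_actions, n_adv_actions,
                 alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.n_actions = n_actions
        self.n_adv_actions = n_adv_actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)
        self.q = defaultdict(float)  # (state, a, b) -> estimated value
        # One pseudo-count per adversary action, so probabilities are defined
        # before any observation.
        self.adv_counts = defaultdict(lambda: [1.0] * n_adv_actions)

    def adversary_probs(self, state):
        """Estimated distribution over adversary actions in `state`."""
        counts = self.adv_counts[state]
        total = sum(counts)
        return [c / total for c in counts]

    def expected_q(self, state, action):
        """Q-value of `action`, averaged over the modelled adversary."""
        probs = self.adversary_probs(state)
        return sum(p * self.q[(state, action, b)] for b, p in enumerate(probs))

    def act(self, state):
        """Epsilon-greedy choice with respect to the expected Q-value."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        return max(range(self.n_actions),
                   key=lambda a: self.expected_q(state, a))

    def update(self, state, action, adv_action, reward, next_state):
        """Update the adversary model, then do one Q-learning step."""
        self.adv_counts[state][adv_action] += 1.0
        best_next = max(self.expected_q(next_state, a)
                        for a in range(self.n_actions))
        key = (state, action, adv_action)
        target = reward + self.gamma * best_next
        self.q[key] += self.alpha * (target - self.q[key])


if __name__ == "__main__":
    # Toy single-state game (an assumption for the demo): the adversary
    # always plays action 1, and the agent earns +1 for matching it, else -1.
    agent = OpponentAwareQLearner(n_actions=2, n_adv_actions=2, epsilon=0.2)
    for _ in range(500):
        a = agent.act(0)
        b = 1
        agent.update(0, a, b, 1.0 if a == b else -1.0, 0)
    agent.epsilon = 0.0
    print("greedy action:", agent.act(0))
    print("adversary model:", agent.adversary_probs(0))
```

Averaging over the modelled adversary in `expected_q` is the only difference from ordinary Q-learning. A learner that ignores the adversary would fold its effect into the reward noise instead.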

Code Implementations: 1 repo