LG · CR · Jun 14, 2021

Poisoning Deep Reinforcement Learning Agents with In-Distribution Triggers

arXiv:2106.07798v1 · 33 citations
Originality: Incremental advance
AI Analysis

This work addresses security vulnerabilities in deep learning models, particularly in reinforcement learning applications, but it is an incremental advance because it builds on existing poisoning and multi-task learning methods.

The paper studies data poisoning attacks on deep reinforcement learning agents. It introduces in-distribution triggers and demonstrates successful attacks in three common environments.

In this paper, we propose a new data poisoning attack and apply it to deep reinforcement learning agents. Our attack centers on what we call in-distribution triggers, which are triggers native to the data distributions the model will be trained on and deployed in. We outline a simple procedure for embedding these, and other, triggers in deep reinforcement learning agents following a multi-task learning paradigm, and demonstrate it in three common reinforcement learning environments. We believe that this work has important implications for the security of deep learning models.
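
To make the multi-task framing concrete, the following is a minimal sketch and not the authors' implementation. It assumes one possible reading of the abstract: the agent is trained on a clean task and an attacker's task, and the reward it receives switches to the attacker's objective whenever a trigger pattern appears in the observation. The trigger here is "in-distribution" only in the sense that it is a pattern the toy observations can contain naturally. All names (`trigger_present`, `PoisonedReward`, the tuple observation format) are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

# Hypothetical toy observation: a flat tuple of integers.
Observation = Tuple[int, ...]
RewardFn = Callable[[Observation, int], float]


def trigger_present(obs: Observation, trigger: Sequence[int]) -> bool:
    """Return True if `trigger` occurs as a contiguous run inside `obs`."""
    n = len(trigger)
    pattern = tuple(trigger)
    return any(tuple(obs[i:i + n]) == pattern for i in range(len(obs) - n + 1))


@dataclass
class PoisonedReward:
    """Reward that switches tasks when the trigger is observed (illustrative only)."""
    clean_reward: RewardFn      # reward for the intended task
    attacker_reward: RewardFn   # reward for the attacker's task
    trigger: Sequence[int]      # pattern that can occur naturally in observations

    def __call__(self, obs: Observation, action: int) -> float:
        # Triggered observations are scored by the attacker's objective;
        # all other observations are scored by the clean objective.
        if trigger_present(obs, self.trigger):
            return self.attacker_reward(obs, action)
        return self.clean_reward(obs, action)


# Usage: the clean task rewards action 0, the attacker's task rewards action 1.
reward = PoisonedReward(
    clean_reward=lambda obs, a: 1.0 if a == 0 else 0.0,
    attacker_reward=lambda obs, a: 1.0 if a == 1 else 0.0,
    trigger=(2, 3),
)
clean_obs = (0, 1, 0, 0)
triggered_obs = (0, 2, 3, 0)
```

An agent trained against a reward of this shape would, under these assumptions, learn the clean behavior on ordinary observations and the attacker's behavior on triggered ones. How the paper actually constructs triggers and training data for its three environments is not specified in this summary.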
