LG CR ML

May 29, 2019

CopyCAT: Taking Control of Neural Policies with Constant Attacks

Léonard Hussenot, Matthieu Geist, Olivier Pietquin

arXiv:1905.12282v210.734 citations

Originality: Incremental advance

AI Analysis

This work addresses security vulnerabilities in reinforcement learning systems used in applications such as autonomous agents. It is rated an incremental advance: its main novelty is the read-only attack scenario it focuses on.

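The paper studies adversarial attacks on deep reinforcement learning agents and introduces CopyCAT, a targeted attack that lures an agent into following an outsider's policy. The attack operates in a read-only setting, where the adversary can alter only the agent's observations and not its internal state. Because the attack is pre-computed, it is fast to apply at inference time and could be used in real time. The authors demonstrate its effectiveness on Atari 2600 games.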

We propose a new perspective on adversarial attacks against deep reinforcement learning agents. Our main contribution is CopyCAT, a targeted attack able to consistently lure an agent into following an outsider's policy. It is pre-computed, therefore fast inferred, and could thus be usable in a real-time scenario. We show its effectiveness on Atari 2600 games in the novel read-only setting. In this setting, the adversary cannot directly modify the agent's state -- its representation of the environment -- but can only attack the agent's observation -- its perception of the environment. Directly modifying the agent's state would require a write-access to the agent's inner workings and we argue that this assumption is too strong in realistic settings.
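The read-only setting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes one additive perturbation (a "mask") is pre-computed offline for each action, so that attacking at run time is a lookup plus an addition. The abstract does not state this design, and every name and value here (`N_ACTIONS`, `OBS_SHAPE`, `EPSILON`, `masks`, `outsider_policy`, `attack_observation`) is hypothetical. The masks are random stand-ins, and the code only shows how an observation would be modified, not how effective masks are computed.

```python
import numpy as np

N_ACTIONS = 4          # hypothetical number of discrete actions
OBS_SHAPE = (84, 84)   # hypothetical grayscale frame size
EPSILON = 0.05         # hypothetical per-pixel bound on the perturbation

rng = np.random.default_rng(0)

# Stand-in for masks computed offline, one per action. Random values are
# used here; a real attack would optimize them against the victim policy.
masks = rng.uniform(-EPSILON, EPSILON, size=(N_ACTIONS, *OBS_SHAPE))


def outsider_policy(obs):
    """Hypothetical outsider policy: maps mean brightness to an action."""
    return int(obs.mean() * N_ACTIONS) % N_ACTIONS


def attack_observation(obs, target_action, masks):
    """Add the pre-computed mask for target_action to the observation.

    Only the observation is touched (read-only setting); the agent's
    internal state, e.g. its stack of past frames, is never written to.
    Pixels are kept in the valid [0, 1] range.
    """
    return np.clip(obs + masks[target_action], 0.0, 1.0)


# One step of the attack loop: the adversary reads the clean observation,
# asks the outsider policy which action it wants, and perturbs the frame.
obs = rng.uniform(0.0, 1.0, size=OBS_SHAPE)
target = outsider_policy(obs)
attacked = attack_observation(obs, target, masks)
```

Since the run-time cost in this sketch is a single array addition and clip per frame, it is consistent with the abstract's point that a pre-computed attack could be fast enough for real-time use.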
