LG · AI · Jan 7, 2025

Rethinking Adversarial Attacks in Reinforcement Learning from Policy Distribution Perspective

arXiv:2501.03562v2 · 15 citations · h-index: 15 · ICASSP
AI Analysis

This paper addresses robustness evaluation for reinforcement learning agents, particularly in continuous action spaces. The contribution is incremental, since it builds on existing adversarial attack methods.

The paper tackles the limited impact of existing adversarial attacks on reinforcement learning agents by proposing DAPGD, which attacks the policy distribution instead of individual sampled actions. In three robot navigation tasks, DAPGD produces an average reward drop 22.03% higher than the best baseline.

Deep Reinforcement Learning (DRL) suffers from uncertainties and inaccuracies in the observation signal in real-world applications. Adversarial attacks are an effective method for evaluating the robustness of DRL agents. However, existing attack methods that target individual sampled actions have limited impact on the overall policy distribution, particularly in continuous action spaces. To address these limitations, we propose the Distribution-Aware Projected Gradient Descent attack (DAPGD). DAPGD uses distribution similarity as the gradient perturbation input to attack the policy network, leveraging the entire policy distribution rather than relying on individual samples. We use the Bhattacharyya distance in DAPGD to measure policy similarity, enabling sensitive detection of subtle but critical differences between probability distributions. Our experimental results demonstrate that DAPGD achieves state-of-the-art results in three robot navigation tasks, with an average reward drop 22.03% higher than that of the best baseline.
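To make the idea in the abstract concrete, here is a minimal sketch of a distribution-level attack. It is not the authors' implementation. It assumes a diagonal-Gaussian policy, for which the Bhattacharyya distance has the closed form D_B = sum_i [ (mu1_i - mu2_i)^2 / (8 v_i) + 0.5 ln( v_i / sqrt(v1_i v2_i) ) ] with v_i = (v1_i + v2_i) / 2. The policy `toy_policy`, its weights `W`, the finite-difference gradient, and all hyperparameters (`eps`, `step`, `iters`) are hypothetical choices made only so the example runs with the standard library.

```python
import math
import random


def bhattacharyya_diag_gauss(mu1, var1, mu2, var2):
    """Closed-form Bhattacharyya distance between two diagonal Gaussians."""
    dist = 0.0
    for m1, v1, m2, v2 in zip(mu1, var1, mu2, var2):
        v_avg = 0.5 * (v1 + v2)
        dist += 0.125 * (m1 - m2) ** 2 / v_avg               # mean term
        dist += 0.5 * math.log(v_avg / math.sqrt(v1 * v2))   # variance term
    return dist


# Hypothetical stand-in for a trained policy network (assumption, not from the paper).
W = [[1.0, -2.0], [0.5, 1.5]]


def toy_policy(obs):
    """Map an observation to the mean and variance of a Gaussian action distribution."""
    mu = [sum(w * o for w, o in zip(row, obs)) for row in W]
    var = [0.1 + 0.05 * o * o for o in obs]  # state-dependent, always positive
    return mu, var


def distribution_pgd(obs, eps=0.1, step=0.02, iters=20, seed=0, h=1e-5):
    """Search the L-infinity ball of radius eps for the observation perturbation
    that maximises the Bhattacharyya distance to the clean policy distribution."""
    rng = random.Random(seed)
    mu0, var0 = toy_policy(obs)

    def loss(delta):
        mu, var = toy_policy([o + d for o, d in zip(obs, delta)])
        return bhattacharyya_diag_gauss(mu0, var0, mu, var)

    # Random start: the distance has zero gradient at delta = 0 (its minimum).
    delta = [rng.uniform(-eps, eps) for _ in obs]
    for _ in range(iters):
        grad = []
        for i in range(len(delta)):
            up, dn = delta[:], delta[:]
            up[i] += h
            dn[i] -= h
            # Central finite difference stands in for backpropagation.
            grad.append((loss(up) - loss(dn)) / (2 * h))
        # Signed ascent step, then projection back onto the eps-ball.
        delta = [
            max(-eps, min(eps, d + step * ((g > 0) - (g < 0))))
            for d, g in zip(delta, grad)
        ]
    return delta, loss(delta)


delta, dist = distribution_pgd([0.3, -0.2])
print("perturbation:", delta, "Bhattacharyya distance:", dist)
```

Because the objective compares whole distributions, the perturbation is driven by shifts in both the mean and the variance of the policy output, not by the gradient of a single sampled action. In a real setting the finite-difference loop would be replaced by automatic differentiation through the policy network.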

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
