LG · AI · May 4

When Actions Disappear: Adversarial Action Removal in Self-Play Reinforcement Learning

arXiv:2605.16312 · 4.8

Predicted impact: top 69% in LG · last 90 days
Originality: Incremental advance

AI Analysis

Identifies action availability as a distinct robustness surface in self-play RL, highlighting a new vulnerability for multi-agent systems.

The paper studies adversarial action removal in self-play RL, showing that learned masking causes substantially more damage than random masking or perturbation baselines across poker games and other domains, with no recovery under extended training.

We study adversarial action masking in self-play reinforcement learning: an attacker selectively removes legal actions from a victim's action set. Unlike observation or action perturbations, removal eliminates decision options before the agent acts. Across poker games scaling from 6 to 5,531 information states and two non-poker domains, learned masking causes substantially more damage than random masking and learned perturbation baselines. The attack persists across Q-learning, PPO, NFSP, neural NFSP, and DQN victims; transfers across agents; is amplified by self-play; and shows no recovery under extended masked training. Mechanistically, the adversary targets high-value decision points, captured by reach-weighted contingent action capacity (CAC$_w$) and a value-weighted refinement CAC$_v$. These results identify action availability as a distinct robustness surface in self-play RL.
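To make the attack surface described in the abstract concrete, here is a minimal sketch of action removal at a single decision point. It is an illustration only, not the paper's method: the learned adversary is replaced by a hypothetical greedy rule that removes the victim's highest-valued action, and the action names, Q-values, and function names (`apply_mask`, `greedy_mask`, `random_mask`, `victim_value`) are all assumptions made for the example.

```python
import random

def apply_mask(legal_actions, removed):
    """Return the action set left after the attacker removes `removed`.
    If the mask would remove every action, it is rejected and the
    original set is returned, so the victim can always act."""
    remaining = [a for a in legal_actions if a not in removed]
    return remaining if remaining else list(legal_actions)

def greedy_mask(q_values, budget=1):
    """Hypothetical stand-in for a learned adversary: remove the
    `budget` actions with the highest victim Q-value."""
    ranked = sorted(q_values, key=q_values.get, reverse=True)
    return set(ranked[:budget])

def random_mask(q_values, budget=1, rng=random):
    """Random-masking baseline: remove `budget` actions uniformly."""
    return set(rng.sample(sorted(q_values), budget))

def victim_value(q_values, removed):
    """Value of the victim's best remaining action after masking."""
    remaining = apply_mask(list(q_values), removed)
    return max(q_values[a] for a in remaining)

# Illustrative Q-values for one decision point (made up for this sketch).
q = {"fold": -1.0, "call": 0.5, "raise": 2.0}

unmasked = victim_value(q, set())
targeted = victim_value(q, greedy_mask(q))

rng = random.Random(0)
trials = 1000
random_avg = sum(victim_value(q, random_mask(q, rng=rng))
                 for _ in range(trials)) / trials

print(f"unmasked={unmasked}, targeted={targeted}, random_avg={random_avg:.2f}")
```

In this toy setting the targeted mask always costs the victim its best option, while a random mask only does so when it happens to pick that action. This mirrors, in a very simplified form, the gap between learned and random masking that the abstract reports.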
