When Critics Disagree: Adaptive Reward Poisoning Attacks in RIS-Aided Wireless Control System
For researchers and practitioners working on DRL-based wireless systems, this work highlights a new vulnerability, disagreement-aware reward poisoning, that robustness evaluations must account for.
The paper proposes a Disagreement-Guided Reward Poisoning (DGRP) attack on a Soft Actor-Critic agent in a RIS-aided wireless control system. The attack corrupts rewards when the agent's critics disagree, and it degrades performance and transmission quality more than the periodic-timing and exploration-triggered baselines.
Reward-poisoning attacks pose a significant risk to learning-based wireless control systems. We propose Disagreement-Guided Reward Poisoning (DGRP), an adaptive attack on a Soft Actor-Critic (SAC) agent. In a Cognitive Radio Network (CRN) environment assisted by Reconfigurable Intelligent Surfaces (RIS), the SAC agent is tasked with maximizing the long-term rate of the secondary users (SUs) by jointly optimizing the transmission power of the SU transmitter and the RIS phase shifts. DGRP corrupts rewards when the SAC agent's dual critics disagree substantially, especially in high-leverage, high-uncertainty states, which distorts value estimates and steers the policy towards suboptimal actions. Our findings demonstrate that DGRP substantially diminishes the performance improvements typically provided by RIS and degrades transmission quality. We further investigate key attack parameters and determine their impact on learning. Compared with periodic-timing and exploration-triggered baselines, DGRP consistently causes greater damage, highlighting the need to consider disagreement-aware threats when evaluating the robustness of Deep Reinforcement Learning (DRL) in RIS-assisted networks.
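
The disagreement trigger described in the abstract can be illustrated with a minimal sketch. The abstract does not give the trigger rule or the form of the corruption, so everything specific here is an assumption made for illustration: the absolute gap between the two critic estimates as the disagreement measure, the `threshold` and `budget` parameters, the sign-flip corruption scaled by `epsilon`, and the function name `dgrp_poison`. It should not be read as the authors' implementation.

```python
import numpy as np


def dgrp_poison(rewards, q1, q2, threshold, budget, epsilon=1.0):
    """Hypothetical disagreement-guided reward poisoning on a batch.

    rewards : clean rewards for a batch of transitions
    q1, q2  : the two critics' value estimates for the same transitions
    threshold, budget, epsilon : illustrative attack parameters (assumed)
    """
    rewards = np.asarray(rewards, dtype=float)
    # Assumed disagreement measure: absolute gap between the two critics.
    gap = np.abs(np.asarray(q1, dtype=float) - np.asarray(q2, dtype=float))

    # Transitions whose critic gap exceeds the threshold are candidates.
    candidates = np.flatnonzero(gap > threshold)
    # Spend the limited budget on the candidates with the largest gap.
    ranked = candidates[np.argsort(-gap[candidates], kind="stable")]
    chosen = ranked[:budget]

    mask = np.zeros(rewards.shape, dtype=bool)
    mask[chosen] = True

    # Assumed corruption: flip the sign of the reward and scale by epsilon.
    poisoned = rewards.copy()
    poisoned[mask] = -epsilon * rewards[mask]
    return poisoned, mask


if __name__ == "__main__":
    r = [1.0, 2.0, 3.0, 4.0]
    q1 = [0.0, 5.0, 1.0, 2.0]
    q2 = [0.1, 1.0, 1.2, 5.5]
    poisoned, mask = dgrp_poison(r, q1, q2, threshold=1.0, budget=1)
    print(poisoned, mask)
```

In this sketch the attacker leaves most rewards untouched and corrupts only the transitions where the critics disagree most, which is one way to read the abstract's "high-uncertainty states". A periodic-timing baseline would instead choose the corrupted indices on a fixed schedule, independent of `q1` and `q2`.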