CRAI Mar 3, 2025

Adversarial Agents: Black-Box Evasion Attacks with Reinforcement Learning

arXiv:2503.01734v2
5 citations
h-index: 9
Originality: Incremental advance
AI Analysis

This work presents a new attack vector for adversarial machine learning that enables more efficient and scalable attacks on ML models. The advance is incremental because it builds on existing RL and AML methods.

The paper tackles the problem of generating adversarial samples for machine learning models by introducing a reinforcement learning (RL) agent that learns attack strategies. Over the course of training, the agent's attack success rate rises by up to 13.2% and the average number of victim model queries per attack falls by up to 16.9%.

Attacks on machine learning models have been extensively studied through stateless optimization. In this paper, we demonstrate how a reinforcement learning (RL) agent can learn a new class of attack algorithms that generate adversarial samples. Unlike traditional adversarial machine learning (AML) methods that craft adversarial samples independently, our RL-based approach retains and exploits past attack experience to improve the effectiveness and efficiency of future attacks. We formulate adversarial sample generation as a Markov Decision Process and evaluate RL's ability to (a) learn effective and efficient attack strategies and (b) compete with state-of-the-art AML. On two image classification benchmarks, our agent increases attack success rate by up to 13.2% and decreases the average number of victim model queries per attack by up to 16.9% from the start to the end of training. In a head-to-head comparison with state-of-the-art image attacks, our approach enables an adversary to generate adversarial samples with 17% more success on unseen inputs post-training. From a security perspective, this work demonstrates a powerful new attack vector that uses RL to train agents that attack ML models efficiently and at scale.
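
To make the Markov Decision Process framing concrete, the following is a minimal sketch of a black-box evasion environment. It is not the paper's formulation: the state, action, and reward definitions, the linear `ToyVictim`, and every name and parameter (`EvasionEnv`, `epsilon`, `step_size`, `max_queries`, the per-query penalty of 0.01) are assumptions chosen for illustration. The state is the current perturbed sample, an action nudges one feature up or down within an L-infinity budget, each step costs one victim query, and the episode ends on misclassification or when the query budget runs out.

```python
import random


class ToyVictim:
    """Hypothetical black-box classifier: the attacker only sees hard labels."""

    def __init__(self, weights, bias=0.0):
        self._weights = list(weights)
        self._bias = bias

    def predict(self, x):
        score = sum(w * v for w, v in zip(self._weights, x)) + self._bias
        return 1 if score > 0 else 0


class EvasionEnv:
    """Illustrative MDP: one episode is one attack on one input sample."""

    def __init__(self, victim, epsilon=0.5, step_size=0.1, max_queries=50):
        self.victim = victim
        self.epsilon = epsilon          # assumed L-infinity perturbation budget
        self.step_size = step_size      # assumed size of a single feature nudge
        self.max_queries = max_queries  # assumed per-attack query budget

    def reset(self, x):
        self.original = list(x)
        self.current = list(x)
        self.true_label = self.victim.predict(self.original)
        self.queries = 0
        return tuple(self.current)

    def step(self, action):
        index, direction = action  # direction is +1 or -1
        low = self.original[index] - self.epsilon
        high = self.original[index] + self.epsilon
        moved = self.current[index] + direction * self.step_size
        self.current[index] = min(high, max(low, moved))  # stay inside the budget

        self.queries += 1  # every step costs one victim query
        success = self.victim.predict(self.current) != self.true_label
        done = success or self.queries >= self.max_queries
        # Assumed reward: bonus for evasion, small penalty for each extra query.
        reward = 1.0 if success else -0.01
        info = {"success": success, "queries": self.queries}
        return tuple(self.current), reward, done, info


def rollout(env, x, policy):
    """Run one attack episode and return (total reward, final info dict)."""
    state = env.reset(x)
    total, done, info = 0.0, False, {"success": False, "queries": 0}
    while not done:
        state, reward, done, info = env.step(policy(state))
        total += reward
    return total, info


if __name__ == "__main__":
    rng = random.Random(0)
    env = EvasionEnv(ToyVictim([1.0, -1.0]), max_queries=20)
    random_policy = lambda s: (rng.randrange(len(s)), rng.choice((-1, 1)))
    print(rollout(env, [0.25, 0.1], random_policy))
```

A learned policy would replace `random_policy`. Because the same policy is reused across episodes, experience from earlier attacks can shape later ones, which is the contrast the abstract draws with stateless optimization that crafts each adversarial sample independently.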

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
