CE, May 9

AutoRedTrader: Autonomous Red Teaming of Trading Agents through Synthetic Misinformation Injection

arXiv:2605.09185 (97.6)
AI Analysis

For developers of LLM-based financial agents, this work provides a systematic method to evaluate vulnerability to subtle misinformation, though the reported attack success rate is modest (26.67%).

AutoRedTrader is an autonomous red-teaming framework that generates subtle financial misinformation to test LLM-based trading agents, achieving a 69.00% misinformation exposure rate and 26.67% attack success rate on Bitcoin transaction data, outperforming baselines.

LLM-based financial agents increasingly rely on both numerical market data and textual signals for sequential trading and stock prediction. However, financial misinformation often appears as subtle textual perturbations rather than explicit falsehoods, making it difficult to detect while still capable of significantly altering agent reasoning and decisions. To study this risk, we propose AutoRedTrader, an autonomous red-teaming framework that generates finance-specific misinformation through behavioral bias manipulation, minor textual perturbations, and rewriting strategies, with agent feedback used to strengthen attacks over time. We evaluate AutoRedTrader in a POMDP-based financial agent simulation environment, and further examine a time-series-informed grounding setting for robustness analysis. The framework enables systematic evaluation of how subtle misinformation affects financial agents and whether historical market evidence can stabilize decisions under misleading textual signals. We evaluate the framework on Bitcoin transaction data. The results show that AutoRedTrader achieves the strongest attack performance with 69.00% misinformation exposure rate and 26.67% attack success rate, outperforming general-purpose misinformation and red-teaming baselines. Ablation studies further show that all modules contribute to generating retrievable and decision-effective financial misinformation.
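The abstract reports two headline metrics without defining them. The sketch below shows one plausible way such rates could be computed from per-trial logs; it is an illustration under assumed definitions, not the paper's evaluation code. The `Trial` record, its field names, and both formulas are hypothetical: exposure is taken here to mean "the injected text was retrieved by the agent", and attack success to mean "the injected text was retrieved and the agent's action changed relative to the clean run", both measured over all trials. The paper may define either metric differently, for example by conditioning success on exposure or by requiring a specific target action.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Trial:
    """One red-teaming trial (hypothetical record; field names are assumptions)."""
    retrieved_injected: bool  # did the agent's retrieval surface the injected text?
    clean_action: str         # action taken on unmodified inputs, e.g. "buy"
    attacked_action: str      # action taken with the injected text present


def exposure_rate(trials: Sequence[Trial]) -> float:
    """Fraction of trials in which the injected text reached the agent (assumed definition)."""
    if not trials:
        raise ValueError("at least one trial is required")
    return sum(t.retrieved_injected for t in trials) / len(trials)


def attack_success_rate(trials: Sequence[Trial]) -> float:
    """Fraction of trials where the text was retrieved AND the action flipped (assumed definition)."""
    if not trials:
        raise ValueError("at least one trial is required")
    flipped = sum(
        t.retrieved_injected and t.clean_action != t.attacked_action
        for t in trials
    )
    return flipped / len(trials)


# Toy data for illustration only; these are not results from the paper.
toy_trials = [
    Trial(True, "buy", "sell"),   # exposed, decision flipped
    Trial(True, "hold", "hold"),  # exposed, decision unchanged
    Trial(True, "sell", "sell"),  # exposed, decision unchanged
    Trial(False, "buy", "buy"),   # injected text never retrieved
]

print(f"exposure rate: {exposure_rate(toy_trials):.2%}")              # 75.00%
print(f"attack success rate: {attack_success_rate(toy_trials):.2%}")  # 25.00%
```

Under these assumed definitions the attack success rate can never exceed the exposure rate, which is consistent with the ordering of the reported figures (69.00% exposure, 26.67% success) but does not confirm that the paper uses the same formulas.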
