Whispers of Wealth: Red-Teaming Google's Agent Payments Protocol via Prompt Injection
This work addresses security risks in LLM-based financial transaction systems, highlighting critical weaknesses in agentic payment architectures for developers and users, though it is incremental as it builds on existing red-teaming and prompt injection research.
The paper red-teamed Google's Agent Payments Protocol (AP2) and found vulnerabilities to prompt injection attacks, such as the Branded Whisper Attack and Vault Whisper Attack, which can manipulate product rankings and extract sensitive user data, demonstrating that simple adversarial prompts reliably subvert agent behavior.
Large language model (LLM) based agents are increasingly used to automate financial transactions, yet their reliance on contextual reasoning exposes payment systems to prompt-driven manipulation. The Agent Payments Protocol (AP2) aims to secure agent-led purchases through cryptographically verifiable mandates, but its practical robustness remains underexplored. In this work, we perform an AI red-teaming evaluation of AP2 and identify vulnerabilities arising from indirect and direct prompt injection. We introduce two attack techniques, the Branded Whisper Attack and the Vault Whisper Attack which manipulate product ranking and extract sensitive user data. Using a functional AP2 based shopping agent built with Gemini-2.5-Flash and the Google ADK framework, we experimentally validate that simple adversarial prompts can reliably subvert agent behavior. Our findings reveal critical weaknesses in current agentic payment architectures and highlight the need for stronger isolation and defensive safeguards in LLM-mediated financial systems.