CR AI | Jun 8, 2025

Mind the Web: The Security of Web Use Agents

Avishag Shapira, Parth Atulbhai Gandhi, Edan Habler, Asaf Shabtai

arXiv:2506.07153v28.65 citations | h-index: 46

Originality: Highly original

AI Analysis

This work exposes a critical security risk for deployed web automation systems, one that calls for new defenses.

The paper demonstrates that malicious content embedded in web pages can steer web-use agents away from their original tasks. Its task-aligned injection technique achieves over 80% attack success rate across various agents and payloads.

Web-use agents are rapidly being deployed to automate complex web tasks with extensive browser capabilities. However, these capabilities create a critical and previously unexplored attack surface. This paper demonstrates how attackers can exploit web-use agents by embedding malicious content in web pages, such as comments, reviews, or advertisements, that agents encounter during legitimate browsing tasks. We introduce the task-aligned injection technique that frames malicious commands as helpful task guidance rather than obvious attacks, exploiting fundamental limitations in LLMs' contextual reasoning. Agents struggle to maintain coherent contextual awareness and fail to detect when seemingly helpful web content contains steering attempts that divert them from their original task goal. To scale this attack, we developed an automated three-stage pipeline that generates effective injections without manual annotation or costly online agent interactions during training, remaining efficient even with limited training data. This pipeline produces a generator model that we evaluate on five popular agents using payloads organized by the Confidentiality-Integrity-Availability (CIA) security triad, including unauthorized camera activation, file exfiltration, user impersonation, phishing, and denial-of-service. This generator achieves over 80% attack success rate (ASR) with strong transferability across unseen payloads, diverse web environments, and different underlying LLMs. This attack succeeds even against agents with built-in safety mechanisms, requiring only the ability to post content on public websites. To address this risk, we propose comprehensive mitigation strategies including oversight mechanisms, execution constraints, and task-aware reasoning techniques.
