AI · HC · Jan 29

How do Visual Attributes Influence Web Agents? A Comprehensive Evaluation of User Interface Design Factors

arXiv:2601.21961v2 · 3 citations · h-index: 4
Originality: Incremental advance
AI Analysis

This work addresses a gap in understanding how visual factors affect web agents. It is an incremental advance: it builds on prior studies of textual attributes but provides new systematic insights for AI and web-interaction research.

The paper examines how visual attributes influence web agents' decision-making in benign scenarios and introduces the VAF pipeline to quantify these effects. It finds that background color contrast, item size, position, and card clarity strongly influence agents, while font styling, text color, and item image clarity have minor effects.

Web agents have demonstrated strong performance on a wide range of web-based tasks. However, existing research on the effect of environmental variation has mostly focused on robustness to adversarial attacks, with less attention to agents' preferences in benign scenarios. Although early studies have examined how textual attributes influence agent behavior, a systematic understanding of how visual attributes shape agent decision-making remains limited. To address this, we introduce VAF, a controlled evaluation pipeline for quantifying how webpage Visual Attribute Factors influence web-agent decision-making. Specifically, VAF consists of three stages: (i) variant generation, which ensures that each variant shares the same semantics as the original item and differs only in visual attributes; (ii) browsing interaction, where agents navigate the page by scrolling and clicking the item of interest, mirroring how human users browse online; (iii) validation through both the agents' click actions and their reasoning, for which we use the Target Click Rate and Target Mention Rate to jointly evaluate the effect of visual attributes. By quantitatively measuring the difference in decision-making between the original and the variant, we identify which visual attributes influence agents' behavior most. Extensive experiments across 8 variant families (48 variants total), 5 real-world websites (including shopping, travel, and news browsing), and 4 representative web agents show that background color contrast, item size, position, and card clarity have a strong influence on agents' actions, whereas font styling, text color, and item image clarity exhibit minor effects.
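The abstract names Target Click Rate and Target Mention Rate but does not give their formulas. The following is a minimal sketch under stated assumptions: Target Click Rate is taken to be the fraction of trials in which the agent clicks the target item, and Target Mention Rate the fraction of trials in which the agent's reasoning mentions the target. The `Trial` record, the substring-based mention check, and the `attribute_effect` helper are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Trial:
    """One browsing episode (hypothetical record layout, not from the paper)."""
    target_id: str    # item whose visual attribute is under study
    target_name: str  # name used to detect a mention in the reasoning
    clicked_id: str   # item the agent finally clicked
    reasoning: str    # agent's free-text rationale


def target_click_rate(trials: List[Trial]) -> float:
    """Assumed definition: share of trials where the agent clicked the target."""
    if not trials:
        return 0.0
    return sum(t.clicked_id == t.target_id for t in trials) / len(trials)


def target_mention_rate(trials: List[Trial]) -> float:
    """Assumed definition: share of trials whose reasoning names the target.

    A case-insensitive substring match stands in for whatever mention
    detection the authors actually use.
    """
    if not trials:
        return 0.0
    hits = sum(t.target_name.lower() in t.reasoning.lower() for t in trials)
    return hits / len(trials)


def attribute_effect(original: List[Trial], variant: List[Trial]) -> Dict[str, float]:
    """Difference in both rates between the variant and the original page."""
    return {
        "delta_tcr": target_click_rate(variant) - target_click_rate(original),
        "delta_tmr": target_mention_rate(variant) - target_mention_rate(original),
    }
```

Under these assumptions, a large positive `delta_tcr` for a variant would indicate that the changed visual attribute pulls clicks toward the target, while `delta_tmr` shows whether the agent's stated reasoning shifts in the same direction.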

