AI · Jul 6, 2025

WebSynthesis: World-Model-Guided MCTS for Efficient WebUI-Trajectory Synthesis

arXiv:2507.04370v1 · 8 citations · h-index: 6
Originality: Incremental advance
AI Analysis

The work addresses scalability and cost problems faced by researchers and developers who train web agents, but it is an incremental advance because it builds on existing world-model and planning methods.

The paper tackles inefficient and costly trajectory generation for web agents. It proposes WebSynthesis, a framework that uses a learned world model and tree-based planning to synthesize trajectories. Agents trained on this synthetic data perform comparably to, or better than, models trained on large-scale real-world data, at reduced API cost.

Recent advancements in large language models (LLMs) have significantly improved the capabilities of web agents. However, effectively navigating complex and dynamic web environments still requires more advanced trajectory-level planning and execution. Prior studies have addressed self-improving agents by collecting extensive GUI trajectories from real-environment interactions. Despite their effectiveness, these approaches encounter two critical challenges: (1) Uncontrollable environment states, where real or sandboxed web environments often yield unstable and non-deterministic feedback, complicating the reproduction and debugging of agent behaviors; and (2) High API costs, as generating even a single interaction trajectory can involve hundreds of queries, leading to considerable API usage and computational expenses. To address these limitations and enable scalable self-improvement for agents, we propose WebSynthesis, a novel framework for trajectory synthesis and training. WebSynthesis leverages a learned world model to simulate virtual web environments, allowing a policy agent to perform efficient and reversible tree-based planning. This approach supports the large-scale generation of diverse and high-quality trajectories, which are subsequently utilized to refine the agent's policy. Experimental results demonstrate that an agent trained using WebSynthesis on a small-scale synthetic dataset achieves performance comparable to or even surpassing that of models trained on large-scale real-world data.
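To make the idea of reversible tree-based planning inside a simulated environment concrete, here is a minimal sketch of Monte Carlo tree search over a toy stand-in for a world model. It is not the authors' implementation: the abstract does not specify the world model, reward, or search details, so `TOY_SITE`, `GOAL`, `world_model_step`, the UCB constant, and the random rollout policy are all illustrative assumptions. In the paper's setting the world model would be a learned model that predicts the next web page state, not a lookup table.

```python
import math
import random
from dataclasses import dataclass, field

# Hypothetical toy site: each page maps to the pages reachable by one click.
TOY_SITE = {
    "home": ["search", "cart"],
    "search": ["product", "home"],
    "product": ["checkout", "search"],
    "cart": ["home"],
    "checkout": [],
}
GOAL = "checkout"  # assumed task: reach the checkout page


def world_model_step(page, action):
    """Stand-in for a learned world model: predict the next page for an action.
    In this toy, an action is simply the name of the linked page."""
    assert action in TOY_SITE[page]
    return action


@dataclass
class Node:
    page: str
    parent: "Node | None" = None
    children: list = field(default_factory=list)
    untried: list = field(default_factory=list)  # actions not yet expanded
    visits: int = 0
    value: float = 0.0  # sum of rewards backed up through this node


def ucb(child, parent_visits, c=1.4):
    """Upper confidence bound: mean reward plus an exploration bonus."""
    return child.value / child.visits + c * math.sqrt(math.log(parent_visits) / child.visits)


def rollout(page, rng, max_depth=6):
    """Random simulated clicks; reward 1.0 if the goal page is reached."""
    for _ in range(max_depth):
        if page == GOAL:
            return 1.0
        page = world_model_step(page, rng.choice(TOY_SITE[page]))
    return 1.0 if page == GOAL else 0.0


def mcts(root_page, iterations=200, seed=0):
    rng = random.Random(seed)
    root = Node(root_page, untried=list(TOY_SITE[root_page]))
    for _ in range(iterations):
        node = root
        # Selection: descend through fully expanded nodes by UCB score.
        while not node.untried and node.children:
            node = max(node.children, key=lambda ch: ucb(ch, node.visits))
        # Expansion: simulate one untried action; no real site is touched,
        # so abandoning this branch later costs nothing.
        if node.untried:
            nxt = world_model_step(node.page, node.untried.pop())
            child = Node(nxt, parent=node, untried=list(TOY_SITE[nxt]))
            node.children.append(child)
            node = child
        # Simulation and backpropagation.
        reward = rollout(node.page, rng)
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    return root


def best_trajectory(root, max_len=10):
    """Follow the most-visited child at each level to extract one trajectory."""
    path, node = [root.page], root
    while node.children and node.page != GOAL and len(path) < max_len:
        node = max(node.children, key=lambda ch: ch.visits)
        path.append(node.page)
    return path


if __name__ == "__main__":
    print(best_trajectory(mcts("home")))
```

Because every transition comes from the simulator, the search can branch and backtrack freely, which is the property the abstract calls reversible planning. Trajectories extracted this way would then serve as training data for the policy.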

