AI · May 19

SimGym: A Framework for A/B Test Simulation in E-Commerce with Traffic-Grounded VLM Agents

arXiv:2605.19219 · 93.3
Predicted impact: top 15% in AI · last 90 days
Originality: Incremental advance
AI Analysis

For e-commerce platforms, it enables rapid, risk-free A/B test simulation without diverting real traffic.

SimGym simulates A/B tests on e-commerce storefronts using VLM agents, achieving 77% directional alignment with real add-to-cart shifts while reducing experimental cycles from weeks to under an hour.

A/B testing remains the gold standard for evaluating modifications to e-commerce storefronts, yet it diverts traffic, requires weeks to reach statistical significance, and risks degrading user experience. We present SimGym, a framework for simulating A/B tests on e-commerce storefronts using vision-language model (VLM) agents operating in a live browser. The framework comprises three key components: (a) a traffic-grounded persona generation pipeline that derives per-shop buyer archetypes and intents from production clickstream data; (b) a live-browser agent architecture that combines multimodal perception over visual and browser-structured observations with episodic memory and guardrails to conduct coherent shopping sessions across control and treatment storefronts; and (c) an evaluation protocol that compares simulated outcome shifts with observed shifts in real buyer behavior. We validate SimGym on A/B tests of visually driven UI theme changes from a major e-commerce platform across diverse storefronts and product categories. Empirical results show that SimGym agents achieve strong agreement with observed outcome shifts, attaining 77% directional alignment with add-to-cart shifts observed across interface variants in real-buyer traffic. It reduces experimental cycles from weeks to under an hour, enabling rapid experimentation without exposing real buyers to candidate variants.
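The evaluation protocol in (c) can be illustrated with a minimal sketch of a directional-alignment score. The abstract does not give the exact formula, so this assumes the metric is the fraction of A/B tests in which the simulated add-to-cart shift (treatment minus control) has the same sign as the shift observed in real traffic. The `ExperimentOutcome` type, the `directional_alignment` function, the tie handling, and all numbers are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ExperimentOutcome:
    """Add-to-cart rates for one A/B test (illustrative values only)."""
    sim_control: float
    sim_treatment: float
    real_control: float
    real_treatment: float


def sign(x: float) -> int:
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def directional_alignment(outcomes: Sequence[ExperimentOutcome]) -> float:
    """Fraction of tests where simulated and real shifts point the same way.

    Assumption: a zero shift only matches another zero shift.
    """
    if not outcomes:
        raise ValueError("need at least one experiment")
    matches = sum(
        1
        for o in outcomes
        if sign(o.sim_treatment - o.sim_control)
        == sign(o.real_treatment - o.real_control)
    )
    return matches / len(outcomes)


# Hypothetical data: three tests agree in direction, one disagrees.
outcomes = [
    ExperimentOutcome(0.10, 0.12, 0.08, 0.09),  # both up
    ExperimentOutcome(0.15, 0.13, 0.11, 0.10),  # both down
    ExperimentOutcome(0.20, 0.22, 0.18, 0.17),  # simulated up, real down
    ExperimentOutcome(0.05, 0.04, 0.06, 0.05),  # both down
]
print(directional_alignment(outcomes))  # 0.75
```

A sign-based score like this ignores the magnitude of each shift, so it says whether the simulation picks the same winner as real traffic, not how well it estimates the size of the effect.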

