AI, Feb 1

SimGym: Traffic-Grounded Browser Agents for Offline A/B Testing in E-Commerce

arXiv:2602.01443v1
1 citation
Originality: Highly original
AI Analysis

SimGym addresses the need for rapid, safe UI testing on e-commerce platforms by offering a scalable offline alternative to traditional A/B testing.

The paper tackles the slow and risky nature of A/B testing in e-commerce by introducing SimGym, a system that uses LLM agents in a browser to simulate buyer behavior, reducing experiment cycles from weeks to under an hour and achieving state-of-the-art alignment with real outcomes.

A/B testing remains the gold standard for evaluating e-commerce UI changes, yet it diverts traffic, takes weeks to reach significance, and risks harming user experience. We introduce SimGym, a scalable system for rapid offline A/B testing using traffic-grounded synthetic buyers powered by Large Language Model agents operating in a live browser. SimGym extracts per-shop buyer profiles and intents from production interaction data, identifies distinct behavioral archetypes, and simulates cohort-weighted sessions across control and treatment storefronts. We validate SimGym against real human outcomes from real UI changes on a major e-commerce platform under confounder control. Even without alignment post-training, SimGym agents achieve state-of-the-art alignment with observed outcome shifts and reduce experiment cycles from weeks to under an hour, enabling rapid experimentation without exposure to real buyers.
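
To make "cohort-weighted sessions across control and treatment storefronts" concrete, here is a minimal sketch of how per-archetype simulation results might be combined into one predicted outcome shift. The abstract does not specify SimGym's aggregation procedure, so everything below is an assumption made for illustration: the `ArchetypeResult` fields, the function names, the use of conversion rate as the outcome, and the example numbers.

```python
from dataclasses import dataclass


@dataclass
class ArchetypeResult:
    """Simulated outcomes for one behavioral archetype (hypothetical schema)."""
    name: str                   # illustrative archetype label
    traffic_share: float        # assumed share of real traffic for this archetype
    control_sessions: int
    control_conversions: int
    treatment_sessions: int
    treatment_conversions: int


def cohort_weighted_rate(results, arm):
    """Average per-archetype conversion rates, weighted by traffic share."""
    total_share = sum(r.traffic_share for r in results)
    if total_share <= 0:
        raise ValueError("traffic shares must sum to a positive value")
    rate = 0.0
    for r in results:
        sessions = getattr(r, f"{arm}_sessions")
        conversions = getattr(r, f"{arm}_conversions")
        if sessions <= 0:
            raise ValueError(f"archetype {r.name!r} has no {arm} sessions")
        # Normalize the weights so they sum to 1 even if the shares do not.
        rate += (r.traffic_share / total_share) * (conversions / sessions)
    return rate


def predicted_shift(results):
    """Treatment minus control, in absolute conversion-rate terms."""
    return (cohort_weighted_rate(results, "treatment")
            - cohort_weighted_rate(results, "control"))


# Made-up numbers, for illustration only.
example = [
    ArchetypeResult("browser", 0.75, 100, 4, 100, 8),
    ArchetypeResult("repeat_buyer", 0.25, 100, 20, 100, 16),
]
control_rate = cohort_weighted_rate(example, "control")
treatment_rate = cohort_weighted_rate(example, "treatment")
shift = predicted_shift(example)
```

Weighting by traffic share rather than pooling raw session counts keeps the estimate tied to the real buyer mix even when the number of simulated sessions differs between archetypes.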

