SE | AI | Nov 28, 2025

Autonomous QA Agent: A Retrieval-Augmented Framework for Reliable Selenium Script Generation

arXiv:2601.06034v1
Originality: Incremental advance
AI Analysis

The paper addresses the manual, error-prone process of writing test scripts for software testing. Its contribution is incremental: it applies RAG to a specific domain.

The paper tackles the generation of reliable Selenium test scripts from requirements, targeting LLM hallucination of UI elements. On e-commerce scenarios it reports 100% syntax validity and 90% execution success, compared to 30% for standard LLM generation.

Software testing is critical in the software development lifecycle, yet translating requirements into executable test scripts remains manual and error-prone. While Large Language Models (LLMs) can generate code, they often hallucinate non-existent UI elements. We present the Autonomous QA Agent, a Retrieval-Augmented Generation (RAG) system that grounds Selenium script generation in project-specific documentation and HTML structure. By ingesting diverse formats (Markdown, PDF, HTML) into a vector database, our system retrieves relevant context before generation. Evaluation on 20 e-commerce test scenarios shows our RAG approach achieves 100% (20/20) syntax validity and 90% (18/20, 95% CI: [85%, 95%], p < 0.001) execution success, compared to 30% for standard LLM generation. While our evaluation is limited to a single domain, our method significantly reduces hallucinations by grounding generation in actual DOM structure, demonstrating RAG's potential for automated UI testing.
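The retrieve-then-generate flow described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the authors' implementation: `ToyVectorStore`, `embed`, and `build_prompt` are hypothetical names, the "embedding" is a plain bag-of-words count rather than a learned model, and the in-memory list stands in for a real vector database. The LLM call itself is omitted; the sketch stops at the grounded prompt.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding (assumption): bag-of-words counts, not a learned model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Hypothetical in-memory stand-in for a vector database."""

    def __init__(self):
        self.entries = []  # list of (chunk_text, vector) pairs

    def add(self, chunk):
        self.entries.append((chunk, embed(chunk)))

    def retrieve(self, query, k=2):
        """Return the k chunks most similar to the query."""
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


def build_prompt(requirement, context_chunks):
    """Assemble a prompt that restricts the model to retrieved DOM context."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return (
        "Use only the element IDs that appear in the context below.\n"
        f"Context:\n{context}\n"
        f"Requirement: {requirement}\n"
        "Write a Selenium test script in Python."
    )


# Illustrative chunks (made up): two HTML fragments and one documentation line.
store = ToyVectorStore()
store.add('<button id="add-to-cart-btn">Add to cart</button>')
store.add('<input id="discount-code" name="discount">')
store.add("Shipping policy: orders over $50 ship free.")

requirement = "Verify that clicking the add to cart button adds the item"
top_chunks = store.retrieve(requirement, k=1)
prompt = build_prompt(requirement, top_chunks)
```

Because the prompt contains only retrieved fragments of the real page, the model is steered toward selectors such as `add-to-cart-btn` that actually exist in the DOM, which is the grounding effect the abstract credits for the reduction in hallucinated elements.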
