SE · AI · Mar 22, 2025

A Study on the Improvement of Code Generation Quality Using Large Language Models Leveraging Product Documentation

arXiv:2503.17837v1 · 1 citation · h-index: 1

Originality: Incremental advance

AI Analysis

This work aims to improve software quality by giving developers better E2E tests. The contribution is incremental: it applies existing LLM methods to one specific bottleneck.

The study addresses the limited automation of end-to-end (E2E) test code generation. It proposes using large language models (LLMs) with tailored prompts to generate E2E test code from product documentation. The generated tests achieved high compilation success and functional coverage, outperforming tests generated from requirement specs and user stories.

Research on using Large Language Models (LLMs) in system development is expanding, especially in automated code and test generation. While E2E testing is vital for ensuring application quality, most test generation research has focused on unit tests, with limited work on E2E test code. This study proposes a method for automatically generating E2E test code from product documentation such as manuals, FAQs, and tutorials using LLMs with tailored prompts. The two-step process first interprets the intent of the documentation and then produces executable test code. Experiments on a web app with six key features (e.g., authentication, profile, discussion) showed that tests generated from product documentation had high compilation success and functional coverage, outperforming those based on requirement specs and user stories. These findings highlight the potential of product documentation to improve E2E test quality and, by extension, software quality.
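The two-step process described in the abstract could be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's implementation: the prompt wording, the `generate_e2e_tests` function, and the `fake_llm` stand-in are all hypothetical, and a real setup would replace `fake_llm` with a call to an actual model.

```python
from typing import Callable

# Hypothetical prompt templates; the paper's actual prompts are not shown here.
INTENT_PROMPT = (
    "Read the product documentation below and list the user-facing "
    "scenarios it describes, one per line.\n\n{doc}"
)
CODEGEN_PROMPT = (
    "Write an executable end-to-end test for this scenario:\n{scenario}"
)


def generate_e2e_tests(doc: str, llm: Callable[[str], str]) -> list[str]:
    """Two-step sketch: extract scenarios, then generate one test per scenario."""
    # Step 1: ask the model to interpret the intent of the documentation.
    raw = llm(INTENT_PROMPT.format(doc=doc))
    # Strip bullet markers and drop empty lines from the model's answer.
    scenarios = [
        line.strip("-* \t") for line in raw.splitlines() if line.strip("-* \t")
    ]
    # Step 2: ask the model for test code, one scenario at a time.
    return [llm(CODEGEN_PROMPT.format(scenario=s)) for s in scenarios]


def fake_llm(prompt: str) -> str:
    """Assumed stand-in for a real model call so the sketch runs offline."""
    if prompt.startswith("Read the product documentation"):
        return "- User logs in\n- User edits profile"
    scenario = prompt.splitlines()[-1]
    return f"# test for: {scenario}"


tests = generate_e2e_tests("Manual: how to log in and edit a profile.", fake_llm)
```

Passing the model in as a plain callable keeps the two steps separable, so each prompt can be adjusted or evaluated on its own.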

