SE · Apr 16

Enhancing Large Language Models with Retrieval Augmented Generation for Software Testing and Inspection Automation

Zoe Fingleton, Nazanin Siavash, Armin Moin

arXiv:2604.15270 · 42.2 · h-index: 1

Predicted impact top 60% in SE · last 90 days · Originality: Synthesis-oriented

AI Analysis

For software engineers, this work offers a practical method to improve LLM-based V&V automation, but the gains are incremental over existing LLM approaches.

This paper applies Retrieval Augmented Generation (RAG) to enhance LLMs for automated test case generation and code inspection, reducing hallucination and improving effectiveness. Experiments show positive impact on both tasks, saving human time and improving V&V efficiency.

In this paper, we focus on automating two widely used Verification and Validation (V&V) activities in the Software Development Lifecycle (SDLC): software testing and software inspection (also known as review). Concerning the former, we concentrate on automated test case generation using Large Language Models (LLMs). For the latter, we enable inspection of the source code by LLMs. To address the known LLM hallucination problem, in which LLMs confidently produce incorrect outputs, we implement a Retrieval Augmented Generation (RAG) pipeline to integrate supplementary knowledge sources and provide additional context to the LLM. Our experimental results indicate that incorporating external context via the RAG pipeline has a generally positive impact on both test case generation and code inspection. This novel approach reduces the total project cost by saving human testers'/inspectors' time. It also improves the effectiveness and efficiency of these V&V activities, as evidenced by our experimental study.
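The retrieval step described in the abstract can be illustrated with a minimal sketch. The listing does not say which retriever, embedding model, knowledge sources, or prompt format the authors use, so everything below is an assumption for illustration: a bag-of-words cosine similarity stands in for the retriever, and the names `retrieve` and `build_prompt` are hypothetical. The sketch stops at prompt construction and makes no LLM call.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric/underscore tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return up to k documents most similar to the query (hypothetical helper).

    Documents with zero similarity are dropped so that unrelated
    text is never added to the prompt.
    """
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in documents]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in ranked[:k] if score > 0]


def build_prompt(task: str, code: str, documents: list[str], k: int = 2) -> str:
    """Assemble a prompt that places retrieved context before the task and code."""
    context = retrieve(task + " " + code, documents, k)
    context_block = "\n".join(f"- {c}" for c in context) or "- (no relevant context found)"
    return f"Context:\n{context_block}\n\nTask: {task}\n\nCode:\n{code}\n"


# Illustrative knowledge base; these entries are invented for the example.
knowledge_base = [
    "parse_date raises ValueError on empty input strings",
    "The logging module is configured at startup",
    "parse_date accepts ISO 8601 date strings only",
]

prompt = build_prompt(
    task="Generate unit tests for parse_date",
    code="def parse_date(s): ...",
    documents=knowledge_base,
)
print(prompt)
```

The same prompt builder would serve the inspection task by changing the `task` string, for example to a request to review the code for defects. A real pipeline would likely replace the bag-of-words scoring with dense embeddings and a vector store.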
