LSPRAG: LSP-Guided RAG for Language-Agnostic Real-Time Unit Test Generation
The work targets software developers who need efficient, language-agnostic test generation, though its contribution is incremental: it builds on existing LSP and RAG methods.
The paper tackles automated unit test generation across multiple programming languages in real-time development settings. It introduces LSPRAG, a framework that uses Language Server Protocol back-ends to give LLMs precise context, and reports line coverage increases of up to 174.55% for Go, 213.31% for Java, and 31.57% for Python over the best baseline results.
Automated unit test generation is essential for robust software development, yet existing approaches struggle to generalize across multiple programming languages and to operate within real-time development workflows. While Large Language Models (LLMs) offer a promising solution, their ability to generate high-coverage test code depends on prompts that supply concise context for the focal method. Current solutions, such as Retrieval-Augmented Generation (RAG), either rely on imprecise similarity-based searches or demand the creation of costly, language-specific static analysis pipelines. To address this gap, we present LSPRAG, a framework for concise-context retrieval tailored to real-time, language-agnostic unit test generation. LSPRAG leverages off-the-shelf Language Server Protocol (LSP) back-ends to supply LLMs with precise symbol definitions and references in real time. By reusing mature LSP servers, LSPRAG provides an LLM with language-aware context retrieval while requiring minimal per-language engineering effort. We evaluated LSPRAG on open-source projects spanning Java, Go, and Python. Compared to the best baseline performance, LSPRAG increased line coverage by up to 174.55% for Go, 213.31% for Java, and 31.57% for Python.
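To make the retrieval step concrete, the following is a minimal sketch of the two standard LSP requests that return symbol definitions and references, `textDocument/definition` and `textDocument/references`, framed with the `Content-Length` header that the LSP base protocol requires. It is not LSPRAG's implementation: the helper names and the file URI are hypothetical and chosen only for illustration, and a real client would first perform the `initialize` handshake with the language server.

```python
import json


def frame_lsp_message(payload: dict) -> bytes:
    """Wrap a JSON-RPC payload in the LSP base-protocol framing."""
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def parse_lsp_message(raw: bytes) -> dict:
    """Decode one framed LSP message back into a dict."""
    header, _, rest = raw.partition(b"\r\n\r\n")
    length = None
    for line in header.split(b"\r\n"):
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("missing Content-Length header")
    return json.loads(rest[:length].decode("utf-8"))


def _position_params(uri: str, line: int, character: int) -> dict:
    # LSP positions are zero-based for both line and character.
    return {
        "textDocument": {"uri": uri},
        "position": {"line": line, "character": character},
    }


def definition_request(request_id: int, uri: str, line: int, character: int) -> dict:
    """Ask the server where the symbol at the given position is defined."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "textDocument/definition",
        "params": _position_params(uri, line, character),
    }


def references_request(request_id: int, uri: str, line: int, character: int,
                       include_declaration: bool = False) -> dict:
    """Ask the server for all references to the symbol at the given position."""
    params = _position_params(uri, line, character)
    params["context"] = {"includeDeclaration": include_declaration}
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "textDocument/references",
        "params": params,
    }


if __name__ == "__main__":
    # Hypothetical focal-method location; the URI is an assumption.
    uri = "file:///tmp/project/calc.py"
    print(frame_lsp_message(definition_request(1, uri, 10, 4)).decode("utf-8"))
    print(frame_lsp_message(references_request(2, uri, 10, 4)).decode("utf-8"))
```

Because the same two requests work against any conforming language server, a client built this way would only need to change which server process it launches per language, which is consistent with the minimal per-language effort the abstract describes.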