SE · Mar 25

Fixturize: Bridging the Fixture Gap in Test Generation

Chengyi Wang, Pengyu Xue, Zhen Yang, Xiapu Luo, Yuxuan Zhang, Xiran Lyu, Yifei Pei, Zonghan Jia, Yichen Sun, Linhao Wu, Kunwu Zheng

arXiv:2601.06615 · 69.7 · h-index: 6

Predicted impact: top 26% in SE · last 90 days
Originality: Incremental advance

AI Analysis

The paper addresses a critical limitation in automated test generation for software developers and represents a domain-specific incremental improvement.

The paper tackles a common failure of LLM-based unit test generation: the models often neglect to construct the necessary test fixtures. It proposes Fixturize, a diagnostic framework that identifies fixture-dependent functions and synthesizes fixtures for them. Fixturize reaches 88.38%-97.00% accuracy in identifying fixture dependence and improves the Suite Pass rate by 18.03%-42.86% on average.

Current Large Language Models (LLMs) have advanced automated unit test generation but face a critical limitation: they often neglect to construct the necessary test fixtures, which are the environmental setups required for a test to run. To bridge this gap, this paper proposes Fixturize, a diagnostic framework that proactively identifies fixture-dependent functions and synthesizes test fixtures for them through an iterative, feedback-driven process, thereby improving the quality of the test suites generated by existing approaches. For rigorous evaluation, the authors introduce FixtureEval, a dedicated benchmark comprising 600 curated functions across two programming languages (PLs), Python and Java, with explicit fixture dependency labels, enabling both the corresponding classification and generation tasks. Empirical results demonstrate that Fixturize is highly effective: it achieves 88.38%-97.00% accuracy across benchmarks in identifying fixture dependence, and its auto-generated fixtures raise the Suite Pass rate (SuitePS) by 18.03%-42.86% on average across both PLs. Because the test fixtures are maintained, Fixturize further improves line/branch coverage by 16.85%/24.08% and 31.54%/119.66% on average when integrated with existing LLM-based and search-based testing tools, respectively. The findings establish fixture awareness as an essential, missing component in modern auto-testing pipelines.
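To make "fixture-dependent function" concrete, the following is a minimal Python sketch of a function that cannot be tested without environmental setup, together with a test that builds that setup. It is an illustration only and is not taken from the paper or from FixtureEval: the function `load_threshold`, the file name `config.json`, and the test class are hypothetical names chosen for this example.

```python
import json
import os
import tempfile
import unittest


def load_threshold(config_path):
    """Hypothetical function under test: reads a numeric setting from a JSON file."""
    with open(config_path) as f:
        return json.load(f)["threshold"]


class LoadThresholdTest(unittest.TestCase):
    def setUp(self):
        # Fixture: create the file that load_threshold expects to find on disk.
        self.tmpdir = tempfile.TemporaryDirectory()
        self.config_path = os.path.join(self.tmpdir.name, "config.json")
        with open(self.config_path, "w") as f:
            json.dump({"threshold": 0.5}, f)

    def tearDown(self):
        # Remove the temporary directory so each test starts from a clean state.
        self.tmpdir.cleanup()

    def test_reads_threshold(self):
        self.assertEqual(load_threshold(self.config_path), 0.5)
```

Saved as a module, the test can be run with `python -m unittest`. A generated test that called `load_threshold` on a path nobody had created would fail with `FileNotFoundError` before reaching any assertion, which is the kind of gap the abstract describes.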
