AutoEmpirical: LLM-Based Automated Research for Empirical Software Fault Analysis
This work addresses the bottleneck of time-consuming manual fault analysis faced by software developers and researchers, though it is an incremental step toward fully automated analysis.
The paper tackles the labor-intensive process of empirical software fault analysis by applying Large Language Models (LLMs) to automate it. Evaluated on 3,829 software faults, the approach achieves an average processing time of about two hours, compared with the weeks of manual effort typically required.
Understanding software faults is essential for empirical research in software development and maintenance. However, traditional fault analysis, while valuable, typically involves multiple expert-driven steps such as collecting potential faults, filtering, and manual investigation. These processes are both labor-intensive and time-consuming, creating bottlenecks that hinder large-scale fault studies in complex yet critical software systems and slow the pace of iterative empirical research. In this paper, we decompose the process of empirical software fault study into three key phases: (1) research objective definition, (2) data preparation, and (3) fault analysis, and we conduct an initial exploratory study of applying Large Language Models (LLMs) to fault analysis of open-source software. Specifically, we perform the evaluation on 3,829 software faults drawn from a high-quality empirical study. Our results show that LLMs can substantially improve efficiency in fault analysis, with an average processing time of about two hours, compared to the weeks of manual effort typically required. We conclude by outlining a detailed research plan that highlights both the potential of LLMs for advancing empirical fault studies and the open challenges that must be addressed to achieve fully automated, end-to-end software fault analysis.