Towards an automated approach for bug fix pattern detection
This work addresses the tedious and time-consuming task of manually characterizing the bug datasets used to evaluate automatic program repair tools, by automating the detection of repair patterns in patches.
The paper tackles the manual analysis of bug fix patterns in datasets by proposing PPD, an automated detector of repair patterns that achieves 91% overall precision and 92% overall recall on the Defects4J dataset when its results are compared with those of a previous manual analysis.
The characterization of bug datasets is essential to support the evaluation of automatic program repair tools. In previous work, we manually studied almost 400 human-written patches (bug fixes) from the Defects4J dataset and annotated them with properties, such as repair patterns. However, manually finding these patterns in different datasets is tedious and time-consuming. To automate this activity, we designed and implemented PPD, a detector of repair patterns in patches, which performs source code change analysis at the abstract syntax tree level. In this paper, we report on PPD and its evaluation on Defects4J, where we compare the results of the automated detection with the results of the previous manual analysis. We found that PPD has an overall precision of 91% and an overall recall of 92%, and we conclude that PPD has the potential to detect as many repair patterns as human manual analysis.
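The comparison between automated detection and manual annotation can be illustrated with a minimal sketch. It assumes that each patch is mapped to a set of pattern labels and that the overall figures are pooled over all patches (micro-averaging); the abstract does not specify how the overall precision and recall are aggregated, so this is one plausible reading rather than the evaluation procedure itself. The function name `precision_recall`, the patch identifiers, and the pattern labels are hypothetical.

```python
from typing import Dict, Set, Tuple


def precision_recall(
    detected: Dict[str, Set[str]],
    annotated: Dict[str, Set[str]],
) -> Tuple[float, float]:
    """Pool true/false positives and false negatives over all patches.

    `detected` maps a patch id to the patterns reported by the tool;
    `annotated` maps a patch id to the patterns found by manual analysis.
    Both structures are assumptions made for this illustration.
    """
    tp = fp = fn = 0
    for patch_id in detected.keys() | annotated.keys():
        found = detected.get(patch_id, set())
        truth = annotated.get(patch_id, set())
        tp += len(found & truth)   # reported and manually annotated
        fp += len(found - truth)   # reported but not annotated
        fn += len(truth - found)   # annotated but missed by the tool
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical data: placeholder patch ids and pattern labels.
detected = {"patch-1": {"pattern-A", "pattern-B"}, "patch-2": {"pattern-C"}}
annotated = {"patch-1": {"pattern-A"}, "patch-2": {"pattern-C", "pattern-D"}}
precision, recall = precision_recall(detected, annotated)
```

Under this reading, a false positive lowers precision and a missed annotation lowers recall, so the two figures capture different kinds of disagreement with the manual analysis.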