SE · Mar 18, 2017

Defect prediction with bad smells in code

Jarosław Hryszko, Lech Madeyski, Marta Dąbrowska, Piotr Konopka

arXiv:1703.06300v1 · 2.93 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses defect prediction for software developers, but it is incremental as it builds on existing methods with minor improvements.

The study investigated whether adding code smell metrics to a basic metric set improves defect prediction in an industrial software development project. It found only a small accuracy increase of 0.0091, within the margin of error, whereas code smell metrics used alone achieved high accuracy (0.8249) and F-measure (0.8286).

Background: Defect prediction can be highly beneficial for software development projects when the prediction is effective and defect-prone areas are identified correctly. One of the key elements of effective software defect prediction is the proper selection of metrics used for dataset preparation. Objective: The purpose of this research is to verify whether code smell metrics, collected using the Microsoft CodeAnalysis tool and added to a basic metric set, can improve defect prediction in an industrial software development project. Results: We verified whether extending the dataset with code smell metrics changes the effectiveness of defect prediction by comparing prediction results for datasets with and without those metrics. We observed only a small improvement when the dataset extended with bad smell metrics was used: the average accuracy increased by 0.0091 and stayed within the margin of error. However, when only code smell metrics were used for prediction (without the basic set of metrics), the process yielded surprisingly high accuracy (0.8249) and F-measure (0.8286). We also describe the data anomalies and problems we observed when two different metric sources were used to prepare one consistent set of data. Conclusion: Extending the dataset with code smell metrics does not significantly improve prediction effectiveness. The achieved result did not compensate for the effort needed to collect the additional metrics. However, we observed that defect prediction based on code smells alone is still highly effective and can be used especially where other metrics can hardly be used.
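
For readers unfamiliar with the two reported measures, the sketch below shows how accuracy and F-measure are conventionally derived from a binary confusion matrix. The counts are hypothetical and are not taken from the study. The abstract does not state which F-measure variant was used, so the balanced F1 score is assumed here.

```python
def accuracy(tp, fp, tn, fn):
    """Share of all modules that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def f_measure(tp, fp, fn):
    """Balanced F1 (assumed variant): harmonic mean of precision and recall."""
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only, not data from the paper:
# tp = defective modules flagged, fp = clean modules flagged,
# tn = clean modules passed, fn = defective modules missed.
tp, fp, tn, fn = 40, 10, 45, 5
acc = accuracy(tp, fp, tn, fn)
f1 = f_measure(tp, fp, fn)
print(f"accuracy={acc:.4f} f_measure={f1:.4f}")
```

Because accuracy counts true negatives and F-measure does not, the two can diverge on imbalanced defect datasets, which is one reason to report both.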
