SE · Jul 28, 2016

Towards Automated Performance Bug Identification in Python

arXiv:1607.08506v1 · 3 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses performance bug identification for software developers in domains like advertising and real-time systems, but it is incremental as it applies existing defect prediction methods to a new type of bug.

The paper tackles early detection of performance bugs in Python software, specifically in a real-time advertising/marketing system. It finds that a C4.5 decision tree model using lines of code changed, file age, and file size as predictors achieves a recall of 0.73, an accuracy of 0.85, and a precision of 0.96.
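For readers less familiar with the three reported metrics, the sketch below shows how recall, precision, and accuracy follow from confusion-matrix counts. The function name and the counts are hypothetical and chosen only for illustration; they are not taken from the paper and do not reproduce its figures.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (recall, precision, accuracy) from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives, and
    false negatives, where "positive" means "flagged as a performance bug".
    """
    # Recall: share of actual performance bugs that the model caught.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Precision: share of flagged changes that really were performance bugs.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Accuracy: share of all predictions that were correct.
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    return recall, precision, accuracy


# Hypothetical counts for illustration only (NOT the paper's data).
recall, precision, accuracy = classification_metrics(tp=8, fp=2, tn=85, fn=5)
print(f"recall={recall:.2f} precision={precision:.2f} accuracy={accuracy:.2f}")
```

A high precision paired with a lower recall, as reported above, would mean that flagged changes are rarely false alarms while some performance bugs still go undetected.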

Context: Software performance is a critical non-functional requirement in many fields, such as mission-critical applications, financial systems, and real-time systems. In this work we focused on early detection of performance bugs; our software under study was a real-time system used in the advertisement/marketing domain.

Goal: Find a simple, easy-to-implement solution for predicting performance bugs.

Method: We built several models using four machine learning methods commonly used for defect prediction: C4.5 Decision Trees, Naïve Bayes, Bayesian Networks, and Logistic Regression.

Results: Our empirical results show that a C4.5 model, using lines of code changed and the file's age and size as explanatory variables, can be used to predict performance bugs (recall=0.73, accuracy=0.85, and precision=0.96). We show that reducing the number of changes delivered in a commit can decrease the chance of performance bug injection.

Conclusions: We believe that our approach can help practitioners eliminate performance bugs early in the development cycle. Our results are also of interest to theoreticians: they establish a link between functional bugs and (non-functional) performance bugs, and explicitly show that attributes used for prediction of functional bugs can also be used for prediction of performance bugs.
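To give a rough sense of how a C4.5-style tree could split on a numeric attribute such as lines of code changed, here is a minimal sketch of the gain-ratio criterion that C4.5 uses to rank candidate splits. It is a simplified illustration, not the authors' implementation: the function names, the toy data, and the exhaustive threshold search are assumptions, and a real C4.5 learner also handles pruning, missing values, and multiple attributes.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split `value <= threshold` vs `value > threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # degenerate split carries no information
    n = len(labels)
    # Information gain: entropy before the split minus weighted entropy after.
    after = len(left) / n * entropy(left) + len(right) / n * entropy(right)
    info_gain = entropy(labels) - after
    # Split information: entropy of the partition sizes themselves.
    split_info = entropy(["left"] * len(left) + ["right"] * len(right))
    return info_gain / split_info


def best_threshold(values, labels):
    """Return the candidate threshold with the highest gain ratio, or None."""
    candidates = sorted(set(values))[:-1]  # largest value cannot split the data
    if not candidates:
        return None
    return max(candidates, key=lambda t: gain_ratio(values, labels, t))


# Hypothetical toy data (NOT from the paper): lines changed per commit,
# and whether the commit introduced a performance bug (1) or not (0).
lines_changed = [5, 12, 20, 150, 300, 420]
has_perf_bug = [0, 0, 0, 1, 1, 1]
print(best_threshold(lines_changed, has_perf_bug))
```

On this toy data the split at 20 lines separates the two classes perfectly, which mirrors, in a deliberately exaggerated way, the abstract's observation that smaller commits are less likely to inject performance bugs.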

