Better Predictors for Issue Lifetime
This work addresses the need of software developers and managers for accurate, interpretable issue lifetime predictors. It is incremental: it builds on prior methods and adds specific improvements.
The paper predicts issue lifetime in software development using small, readable decision trees and correlation feature selection, achieving a median precision of 71% and a median false alarm rate of 13%. It also proposes using cross-project data to handle class imbalance.
Predicting issue lifetime can help software developers, managers, and stakeholders prioritize work, allocate development resources, and better understand project timelines. Progress has been made on this prediction problem, but prior work has reported low precision and high false alarms. The latest results also rely on complex models, such as random forests, that are hard to read. We solve both issues by using small, readable decision trees (under 20 lines long) and correlation feature selection to predict issue lifetime, achieving high precision and low false alarms (medians of 71% and 13% respectively). We also address the problem of high class imbalance within issue datasets: when local data fails to train a good model, we show that cross-project data can be used in its place. In fact, cross-project data works so well that we argue it should be the default approach for learning predictors for issue lifetime.
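To make the two reported measures concrete, here is a minimal sketch of how precision and false alarm rate can be computed from binary predictions. It assumes the standard confusion-matrix definitions, precision = TP / (TP + FP) and false alarm = FP / (FP + TN). The function name and the toy labels are illustrative only and are not taken from the paper.

```python
from typing import Sequence, Tuple


def precision_and_false_alarm(
    actual: Sequence[bool], predicted: Sequence[bool]
) -> Tuple[float, float]:
    """Return (precision, false alarm rate) for binary labels.

    Assumed definitions (standard, not quoted from the paper):
      precision   = TP / (TP + FP)
      false alarm = FP / (FP + TN)
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")

    pairs = [(bool(a), bool(p)) for a, p in zip(actual, predicted)]
    tp = sum(1 for a, p in pairs if a and p)          # true positives
    fp = sum(1 for a, p in pairs if not a and p)      # false positives
    tn = sum(1 for a, p in pairs if not a and not p)  # true negatives

    # Guard against empty denominators (no positive predictions, or no negatives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    false_alarm = fp / (fp + tn) if (fp + tn) else 0.0
    return precision, false_alarm


# Toy example: True means "issue closes within the target time window"
# (a hypothetical labeling chosen for illustration).
actual = [True, True, True, False, False, False, False, False]
predicted = [True, True, False, True, False, False, False, False]
precision, false_alarm = precision_and_false_alarm(actual, predicted)
print(f"precision={precision:.2f}, false_alarm={false_alarm:.2f}")
```

In this toy data there are 2 true positives, 1 false positive, and 4 true negatives, so precision is 2/3 and the false alarm rate is 1/5. With heavily imbalanced classes, a learner can score well on one measure while failing on the other, which is why the two are reported together.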