Benchmarking cross-project defect prediction approaches with costs metrics
This work highlights a critical gap in the evaluation of CPDP for software quality assurance and offers an incremental insight: current approaches may not be cost-effective compared to trivial baselines.
The paper benchmarks 26 cross-project defect prediction (CPDP) approaches using cost metrics. It finds that, under cost considerations, trivially assuming everything is defective performs better on average than CPDP, and that rankings based on cost metrics are uncorrelated with rankings based on metrics that do not consider costs.
Defect prediction can be a powerful tool to guide the use of quality assurance resources. In recent years, many researchers have focused on the problem of Cross-Project Defect Prediction (CPDP), i.e., the creation of prediction models based on training data from other projects. However, only a few of the published papers evaluate the cost efficiency of predictions, i.e., whether they save costs when they are used to guide quality assurance efforts. In this paper, we provide a benchmark of 26 CPDP approaches based on cost metrics. Our benchmark shows that trivially assuming everything is defective is on average better than CPDP under cost considerations. Moreover, we show that our ranking of approaches using cost metrics is uncorrelated with a ranking based on metrics that do not directly consider costs. These findings show that we must put more effort into evaluating the actual benefits of CPDP, as the current state of the art of CPDP can be beaten by a trivial approach in cost-oriented evaluations.
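To make the comparison against the trivial baseline concrete, the following minimal sketch computes the cost of a set of predictions and of the "everything is defective" baseline. It is an illustration under stated assumptions, not the cost metrics used in the benchmark: the function `prediction_cost`, the parameters `qa_cost` and `miss_cost`, their values, and the toy data are all hypothetical.

```python
def prediction_cost(actual, predicted, qa_cost=1.0, miss_cost=5.0):
    """Total cost under a hypothetical two-parameter cost model.

    Assumption: every artifact flagged as defective is reviewed at a
    fixed cost `qa_cost`, and every defective artifact that is not
    flagged causes a fixed cost `miss_cost`.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    # Number of artifacts sent to quality assurance.
    flagged = sum(1 for p in predicted if p)
    # Defective artifacts that the prediction failed to flag.
    missed = sum(1 for a, p in zip(actual, predicted) if a and not p)
    return qa_cost * flagged + miss_cost * missed


# Toy data, invented for illustration only (True = defective).
actual = [True, False, True, True, False, False, True, False]
predicted = [True, False, False, False, False, True, True, False]

# Trivial baseline: assume every artifact is defective.
trivial = [True] * len(actual)

model_cost = prediction_cost(actual, predicted)
trivial_cost = prediction_cost(actual, trivial)

print(f"model cost:   {model_cost}")
print(f"trivial cost: {trivial_cost}")
```

With these made-up numbers the trivial baseline is cheaper, because the two missed defects outweigh the savings from reviewing fewer artifacts. The outcome depends entirely on the assumed ratio of `miss_cost` to `qa_cost`, which is why a cost-oriented evaluation can rank approaches differently from one based on classification metrics alone.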