LG SE ML Aug 30, 2020

A Novel Multiple Ensemble Learning Models Based on Different Datasets for Software Defect Prediction

Ali Nawaz, Attique Ur Rehman, Muhammad Abbas

arXiv:2008.13114v1 1.25 citations

Originality Synthesis-oriented

AI Analysis

This addresses software testing efficiency for developers, but is incremental as it applies existing ensemble methods to standard datasets.

The paper tackled software defect prediction by comparing ensemble learning models against KNN, Decision Tree, SVM, and Naïve Bayes on multiple datasets, achieving high classification accuracies such as 99.27% on PC1.

Software testing is one of the most important ways to ensure software quality. Testing has been found to account for more than 50% of overall project cost. Effective and efficient software testing uses minimal project resources. It is therefore important to construct a procedure that not only tests efficiently but also minimizes the use of project resources. The goal of software testing is to find as many defects in the software system as possible; the more defects found, the more efficient the testing. Different techniques have been proposed to detect defects in software while using resources well and achieving good results. As the world continues to move toward data-driven approaches for making important decisions, in this research paper we perform a machine learning analysis on publicly available datasets and try to achieve the maximum accuracy. The major focus of the paper is to apply different machine learning techniques to the datasets and find out which technique produces the best results. In particular, we propose ensemble learning models and perform a comparative analysis against KNN, Decision Tree, SVM, and Naïve Bayes on different datasets, and demonstrate that the ensemble method outperforms the other methods in terms of accuracy, precision, recall, and F1-score. The classification accuracy of the ensemble model trained on CM1 is 98.56%, that of the ensemble model trained on KM2 is 98.18%, and that of the ensemble model trained on PC1 is 99.27%. This shows that the ensemble is a more effective method for defect prediction than the other techniques.
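The comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the ensemble combines its base classifiers, so hard (majority) voting is assumed here as one common choice. The names majority_vote and binary_metrics are hypothetical helpers, and the labels and per-model predictions are toy values made up for illustration, not data or results from the paper.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine several models' label lists into one list by hard voting.

    ASSUMPTION: hard voting is used for illustration only; the paper's
    actual combination rule is not stated in the abstract.
    """
    combined = []
    for votes in zip(*predictions_per_model):
        # Pick the label that most models predicted for this module.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy data (hypothetical): 1 = defective module, 0 = clean module.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
model_a = [1, 0, 0, 1, 0, 1, 1, 0]
model_b = [1, 1, 1, 0, 0, 0, 1, 0]
model_c = [0, 0, 1, 1, 0, 0, 0, 0]

ensemble = majority_vote([model_a, model_b, model_c])

for name, preds in [("model_a", model_a), ("model_b", model_b),
                    ("model_c", model_c), ("ensemble", ensemble)]:
    print(name, binary_metrics(y_true, preds))
```

In this toy setup each base model makes errors on different modules, so the vote corrects them. That is the usual motivation for ensembles, though it says nothing about whether the accuracies reported in the abstract would hold on other data or under other evaluation protocols.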
