Convolutional Neural Networks over Control Flow Graphs for Software Defect Prediction
This work addresses the low accuracy of software defect prediction, a problem that affects developers and testers. Its approach is novel, though it is an incremental step in applying deep learning to graph representations of programs.
The paper tackles software defect prediction by using control flow graphs and deep graph convolutional neural networks to learn semantic features, achieving significant performance improvements over baselines on four real-world datasets.
Defects in software components are unavoidable, and they lead not only to wasted time and money but also to many serious consequences. To build predictive models, previous studies have focused on manually extracting features or on tree representations of programs, combined with a variety of machine learning algorithms. However, the performance of these models is limited because the existing features and tree structures often fail to capture program semantics. To explore program semantics more deeply, this paper proposes to leverage precise graphs that represent program execution flows, together with deep neural networks that automatically learn defect features. First, control flow graphs are constructed from the assembly instructions obtained by compiling source code; we then apply multi-view multi-layer directed graph-based convolutional neural networks (DGCNNs) to learn semantic features. Experiments on four real-world datasets show that our method significantly outperforms the baselines, including several other deep learning approaches.
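To make the two stages described above more concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes a hypothetical toy instruction format (label, opcode, jump target), builds an instruction-level control flow graph from it, and then runs one unweighted propagation step in which each node sums its own one-hot opcode feature with those of its predecessors. The function names `build_cfg`, `one_hot_features`, and `aggregate` are illustrative, and the aggregation is a simplified stand-in for a learned DGCNN layer: it has no trainable weights, no multiple views, and no nonlinearity.

```python
from collections import defaultdict

# Toy instruction list: (label or None, opcode, jump target or None).
# This format is a hypothetical illustration, not real compiler output.
PROGRAM = [
    (None,   "cmp", None),    # 0
    (None,   "jle", "else"),  # 1: conditional jump
    (None,   "mov", None),    # 2
    (None,   "jmp", "end"),   # 3: unconditional jump
    ("else", "add", None),    # 4
    ("end",  "ret", None),    # 5
]


def build_cfg(program):
    """Return directed edges {src: [dst, ...]} over instruction indices."""
    labels = {lab: i for i, (lab, _, _) in enumerate(program) if lab}
    edges = defaultdict(list)
    for i, (_, op, target) in enumerate(program):
        if target is not None:
            edges[i].append(labels[target])  # jump edge
        # Unconditional jumps and returns never fall through.
        falls_through = op not in ("jmp", "ret")
        if falls_through and i + 1 < len(program):
            edges[i].append(i + 1)           # fall-through edge
    return dict(edges)


def one_hot_features(program):
    """Encode each instruction as a one-hot vector over the sorted opcodes."""
    vocab = sorted({op for _, op, _ in program})
    return [[1.0 if op == v else 0.0 for v in vocab] for _, op, _ in program]


def aggregate(features, edges):
    """One propagation step: each node sums its own feature vector
    with the feature vectors of its predecessors (incoming edges)."""
    out = [list(f) for f in features]
    for src, dsts in edges.items():
        for dst in dsts:
            out[dst] = [a + b for a, b in zip(out[dst], features[src])]
    return out


cfg = build_cfg(PROGRAM)
hidden = aggregate(one_hot_features(PROGRAM), cfg)
```

In this sketch the edge direction matters: the final `ret` node receives information from both branches that reach it, while the entry node keeps only its own feature. A learned layer would replace the plain sum with weighted transformations and stack several such steps.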