SE CL Sep 10, 2019

An Evaluation of Programming Language Models' Performance on Software Defect Detection

arXiv:1909.10309v1

Originality: Synthesis-oriented

AI Analysis

This work addresses software defect detection for developers and researchers, but it is incremental as it applies existing models to a new domain.

This study evaluated multiple language models for detecting software defects at syntactical, algorithmic, and general levels, finding that BERT matched or outperformed all other models tested.

This dissertation presents an evaluation of several language models on software defect datasets. A language model (LM) "can provide word representation and probability indication of word sequences as the core component of an NLP system." Language models for source code are specialized for tasks in the software engineering field. While some of these models are taken directly from NLP, others incorporate structural information that is unique to source code. Software defects are flaws in the source code that lead to unexpected behaviours and malfunctions at all levels. This study provides an original attempt to detect these defects at three different levels (syntactical, algorithmic and general). We also provide a tool chain that researchers can use to reproduce the experiments. We tested the models against different datasets and analyzed the results. Our original attempt to deploy BERT, the state-of-the-art model for multiple tasks, matched or outscored all other models compared.
