SE AI LG
Aug 8, 2021

An Empirical Study on Predictability of Software Code Smell Using Deep Learning Models

Himanshu Gupta, Tanmay G. Kulkarni, Lov Kumar, Lalita Bhanu Murthy Neti, Aneesh Krishna

arXiv:2108.04659v113.320 citations

Originality Synthesis-oriented

AI Analysis

This work addresses code quality prediction for software developers, but it is incremental as it applies existing deep learning methods to a domain previously explored with other techniques.

The paper tackled the problem of predicting software code smells using deep learning models, achieving accuracy improvements from 88.47% to 96.84% by applying data sampling and feature selection techniques.

A code smell, much like a bad smell, is a surface indication that something is tainted, here in terms of software writing practices. It indicates a deeper problem within the code, one that is readily apparent to experienced software developers who follow acceptable coding practices. Recent studies have often observed that code containing code smells has a higher probability of change during the software development cycle. In this paper, we developed code smell prediction models that use features extracted from source code to predict eight types of code smell. Our work also applies data sampling techniques to handle the class imbalance problem and feature selection techniques to find relevant feature sets. Previous studies used techniques such as Naive Bayes and Random Forest but did not explore deep learning methods for predicting code smells. A total of 576 distinct deep learning models were trained using the features and datasets mentioned above. The study concluded that the deep learning models trained on data from the Synthetic Minority Oversampling Technique (SMOTE) gave better results in terms of accuracy and AUC, with the accuracy of some models improving from 88.47% to 96.84%.
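
To illustrate the kind of oversampling the abstract refers to, here is a minimal sketch of the core idea behind the Synthetic Minority Oversampling Technique: new minority-class samples are created by interpolating between an existing minority sample and one of its nearest minority neighbours. This is not the authors' implementation. The function name `smote`, its parameters, and the toy feature values are assumptions made purely for illustration, and only the Python standard library is used.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic points by interpolating between minority
    samples and their k nearest minority neighbours (SMOTE-style sketch)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: two source-code metrics per sample,
# 3 "smelly" (minority) samples versus 8 "clean" (majority) samples.
smelly = [[10.0, 3.0], [12.0, 4.0], [11.0, 5.0]]
clean = [[float(x), 1.0] for x in range(8)]

# Oversample the minority class until both classes have the same size.
balanced_smelly = smelly + smote(smelly, n_new=len(clean) - len(smelly), k=2)
```

Because every synthetic point lies on a segment between two real minority samples, the added data stays inside the region the minority class already occupies, rather than duplicating existing rows. In practice, oversampling of this kind would be applied only to the training split, so that evaluation data remains untouched.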
