SE · Feb 8, 2019

Code Smell Detection using Multilabel Classification Approach

arXiv:1902.03222v1 · 88 citations
Originality: Incremental advance
AI Analysis

This paper addresses the problem that code smell detection is subjective and limited in scope. The advance is incremental, since it builds on existing machine learning approaches.

The paper tackles the problem of detecting multiple code smells in the same code element, which existing machine learning detectors cannot do. It applies multilabel classification methods to a dataset converted from two code smell datasets and reports good performance in cross-validation.

Code smells are characteristics of software that indicate a code or design problem which can make the software hard to understand, evolve, and maintain. The code smell detection tools proposed in the literature produce different results, as smells are informally defined or subjective in nature. To address the issue of tool subjectivity, machine learning techniques have been proposed that can learn to distinguish the characteristics of smelly and non-smelly source code elements (classes or methods). However, the existing machine learning techniques can detect only a single type of smell in a code element, which does not correspond to a real-world scenario. In this paper, we use multilabel classification methods to detect whether a given code element is affected by multiple smells. We considered two code smell datasets for this work and converted them into a multilabel dataset. In our experiments, two multilabel methods applied to the converted dataset demonstrate good performance under 10-fold cross-validation with ten repetitions.
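The abstract does not spell out how the two datasets are converted or how the evaluation is split, so the following is only a minimal sketch under stated assumptions. It assumes each single-label dataset maps a code element identifier to a feature vector and a smelly/non-smelly flag, and that only elements present in both datasets are kept. The names `to_multilabel`, `repeated_kfold`, `smell_a`, and `smell_b` are hypothetical and do not come from the paper.

```python
import random
from typing import Dict, Iterator, List, Tuple

# Assumed shape of a single-label dataset: element id -> (metrics, is_smelly).
SingleLabel = Dict[str, Tuple[List[float], bool]]
MultiLabel = Dict[str, Tuple[List[float], Dict[str, bool]]]


def to_multilabel(datasets: Dict[str, SingleLabel]) -> MultiLabel:
    """Merge single-label datasets (keyed by smell name) into one multilabel dataset.

    Assumption: only elements found in every dataset are kept, so each
    retained element has one label per smell.
    """
    common = set.intersection(*(set(d) for d in datasets.values()))
    first = next(iter(datasets.values()))
    merged: MultiLabel = {}
    for element in sorted(common):
        features = first[element][0]  # assume metrics agree across datasets
        labels = {smell: d[element][1] for smell, d in datasets.items()}
        merged[element] = (features, labels)
    return merged


def repeated_kfold(
    n_samples: int, k: int = 10, repeats: int = 10, seed: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for `repeats` reshuffled k-fold partitions."""
    rng = random.Random(seed)
    for _ in range(repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        folds = [indices[i::k] for i in range(k)]  # k near-equal folds
        for i, test in enumerate(folds):
            train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
            yield train, test


if __name__ == "__main__":
    # Toy, made-up data purely for illustration.
    smell_a = {"m1": ([12.0, 3.0], True), "m2": ([4.0, 1.0], False)}
    smell_b = {"m1": ([12.0, 3.0], True), "m2": ([4.0, 1.0], True)}
    merged = to_multilabel({"smell_a": smell_a, "smell_b": smell_b})
    for element, (features, labels) in merged.items():
        print(element, features, labels)
```

With the default arguments, `repeated_kfold` produces 100 train/test splits (10 folds times 10 repetitions). A multilabel classifier would be trained and scored once per split and the scores averaged.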

