SE · ME · Apr 19

A Code Smell Refactoring Approach using GNNs

arXiv:2511.12069 · 2.0 · h-index: 2

Predicted impact: top 88% in SE · last 90 days · Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of automated code smell refactoring for software engineers, but the improvement is incremental over existing deep learning approaches.

The authors propose a graph neural network (GNN)-based approach for refactoring three code smells (long method, large class, feature envy), using class-level and method-level input graphs with graph classification and node classification tasks. They achieve superior refactoring performance compared to traditional and state-of-the-art deep learning methods, enabled by a semi-automated dataset generation technique.

Code smells are a major challenge in software refactoring: they indicate latent design or implementation flaws that may degrade software maintainability and evolution. Over the past decades, a variety of refactoring approaches have been proposed, which can be broadly classified into metrics-based, rule-based, and machine learning-based approaches. In recent years, deep learning-based approaches have also attracted widespread attention. However, existing techniques exhibit various limitations. Metrics- and rule-based approaches rely heavily on manually defined heuristics and thresholds, whereas deep learning-based approaches are often constrained by dataset availability and model design. In this study, we propose a graph-based deep learning approach for code smell refactoring. Specifically, we design two types of input graphs (class-level and method-level) and employ both graph classification and node classification tasks to address the refactoring of three representative code smells: long method, large class, and feature envy. For our experiments, we propose a semi-automated dataset generation approach that can generate a large-scale dataset with minimal manual effort. We implement the proposed approach with three classical graph neural network (GNN) architectures, GCN, GraphSAGE, and GAT, and evaluate its performance against both traditional and state-of-the-art deep learning approaches. The results demonstrate that the proposed approach achieves superior refactoring performance.
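To make the two task types in the abstract concrete, the following is a minimal sketch of a single GCN layer feeding a node classification head and a graph classification head. It is not the authors' implementation: the toy graph, the feature sizes, the random weights, and the names `normalize_adjacency`, `gcn_layer`, and `forward` are all assumptions chosen for illustration, and the abstract does not say how nodes, edges, or labels are actually defined.

```python
import numpy as np


def normalize_adjacency(adj):
    """Symmetric GCN normalization: D^-1/2 (A + I) D^-1/2."""
    a_hat = adj + np.eye(adj.shape[0])      # add self-loops
    deg = a_hat.sum(axis=1)                 # >= 1 because of self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(a_norm, h, w):
    """One graph convolution followed by ReLU."""
    return np.maximum(a_norm @ h @ w, 0.0)


def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def forward(adj, x, params):
    a_norm = normalize_adjacency(adj)
    h = gcn_layer(a_norm, x, params["w1"])
    # Node classification: one prediction per node (hypothetically, e.g.
    # "should this node be moved or extracted?").
    node_probs = softmax(a_norm @ h @ params["w_node"])
    # Graph classification: mean-pool node embeddings into one prediction
    # for the whole graph (hypothetically, "is this unit smelly?").
    graph_probs = softmax(h.mean(axis=0) @ params["w_graph"])
    return node_probs, graph_probs


# Hypothetical toy input: a 4-node path graph with 5 features per node.
rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
x = rng.normal(size=(4, 5))
params = {
    "w1": rng.normal(size=(5, 8)),
    "w_node": rng.normal(size=(8, 2)),
    "w_graph": rng.normal(size=(8, 2)),
}
node_probs, graph_probs = forward(adj, x, params)
```

The weights here are untrained, so the probabilities carry no meaning; the sketch only shows how one shared graph encoder can serve both a per-node and a per-graph prediction. GraphSAGE and GAT would replace the aggregation inside `gcn_layer` with neighbor sampling/aggregation and attention-weighted aggregation, respectively.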

