SE · LG · Nov 14, 2020

Classification of Reverse-Engineered Class Diagram and Forward-Engineered Class Diagram using Machine Learning

arXiv:2011.07313v1 · 6 citations
AI Analysis

This paper addresses a specific need in the software industry, identifying diagram types in open-source projects, but the contribution is incremental because it applies existing methods to a new classification task.

The paper tackled the problem of classifying UML class diagrams as forward-engineered or reverse-engineered by building a classifier using supervised machine learning, with the Random Forest algorithm achieving the best performance on a dataset of 999 diagrams.

A UML class diagram visualizes a software system and helps readers understand it by showing the system's classes, their attributes and methods, and their relations with other objects. In practice, engineers work with two types of class diagram: 1) the Forward Engineered Class Diagram (FwCD), which is hand-made as part of the forward-looking development process, and 2) the Reverse Engineered Class Diagram (RECD), which is reverse engineered from the source code. When working with new open-source software projects in industry, it is important to know which type of class diagram a project uses. To solve this problem, we propose building a classifier that labels a UML class diagram as FwCD or RECD using supervised machine learning. The approach involves analyzing which features are useful for classifying class diagrams. Several machine learning models were compared, and the Random Forest algorithm performed best. Performance testing was done on 999 class diagrams.
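To make the classification setup concrete, here is a minimal sketch of a forest-style classifier in plain Python. It is not the paper's implementation: the three features (fraction of typed attributes, mean methods per class, fraction of operations with parameters), the toy values, and all function names are assumptions invented for illustration, and the trees are depth-1 stumps, a heavily simplified stand-in for a full Random Forest.

```python
import random
from collections import Counter

# Hypothetical feature vectors, invented for illustration only:
# [fraction of typed attributes, mean methods per class,
#  fraction of operations with parameters]
X = [
    [0.20, 2.0, 0.10], [0.30, 1.5, 0.00], [0.25, 3.0, 0.20], [0.40, 2.5, 0.15],
    [0.90, 6.0, 0.80], [0.85, 7.5, 0.90], [0.95, 5.0, 0.70], [1.00, 8.0, 0.85],
]
y = ["FwCD"] * 4 + ["RECD"] * 4


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def majority(labels):
    """Most frequent label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(rows, labels, feature_ids):
    """Pick the (feature, threshold) split with the lowest weighted Gini."""
    best = None
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            left = [lab for r, lab in zip(rows, labels) if r[f] <= t]
            right = [lab for r, lab in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t, majority(left), majority(right))
    if best is None:
        # No usable split in this sample: fall back to a constant predictor.
        m = majority(labels)
        return (feature_ids[0], float("inf"), m, m)
    return best[1:]


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train stumps on bootstrap samples, each on a random feature subset."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    k = max(1, int(n_features ** 0.5))  # features considered per tree
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        feats = rng.sample(range(n_features), k)
        forest.append(
            fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feats)
        )
    return forest


def predict(forest, row):
    """Majority vote over all stumps."""
    votes = [left if row[f] <= t else right for f, t, left, right in forest]
    return majority(votes)


forest = fit_forest(X, y)
print(predict(forest, [0.10, 1.0, 0.00]))
print(predict(forest, [0.95, 9.0, 0.90]))
```

In practice one would more likely use an established library implementation and evaluate it with held-out data. The sketch only shows the two ingredients that define the ensemble: bootstrap sampling of training diagrams and a random feature subset per tree.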

