SE · Jul 17, 2018

Automatic Traceability Maintenance via Machine Learning Classification

arXiv:1807.06684v1 · 62 citations
Originality: Incremental advance
AI Analysis

This paper addresses the costly problem of outdated traceability information for software developers and other stakeholders, offering an incremental improvement over existing methods.

The paper tackles the problem of maintaining software traceability links as projects evolve. It proposes TRAIL, an approach that trains a machine learning classifier on existing traceability knowledge to derive new links and update existing ones. On 11 datasets, TRAIL outperforms seven information retrieval techniques in precision, recall, and F-score.

Previous studies have shown that software traceability, the ability to link together related artifacts from different sources within a project (e.g., source code, use cases, documentation), improves project outcomes by assisting developers and other stakeholders with common tasks such as impact analysis and concept location. Establishing traceability links in a software system is an important and costly task, but it is only half the struggle. As the project undergoes maintenance and evolution, new artifacts are added and existing ones are changed, resulting in outdated traceability information. Therefore, specific steps need to be taken to make sure that traceability links are maintained in tandem with the rest of the project. In this paper we address this problem and propose a novel approach called TRAIL for maintaining traceability information in a system. The novelty of TRAIL lies in the fact that it leverages previously captured knowledge about project traceability to train a machine learning classifier, which can then be used to derive new traceability links and update existing ones. We evaluated TRAIL on 11 commonly used traceability datasets from six software systems and compared it to seven popular information retrieval (IR) techniques, including the most common approaches used in previous work. The results indicate that TRAIL outperforms all IR approaches in terms of precision, recall, and F-score.
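
The abstract reports results as precision, recall, and F-score over traceability links. As a rough illustration of how these measures are typically computed for a set of predicted links, here is a minimal sketch. It is not the paper's evaluation code. It assumes the balanced F1 variant of the F-score, and the artifact identifiers and the helper name precision_recall_f1 are hypothetical.

```python
from typing import Set, Tuple

# A traceability link as a (source artifact, target artifact) pair.
# The identifiers used below are made up for illustration.
Link = Tuple[str, str]


def precision_recall_f1(predicted: Set[Link], actual: Set[Link]) -> Tuple[float, float, float]:
    """Score predicted links against the known-correct links.

    Assumes the balanced F1 measure; the abstract only says "F-score".
    """
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical ground truth: three correct links.
actual = {("UC1", "Login.java"), ("UC1", "Session.java"), ("UC2", "Cart.java")}
# Hypothetical classifier output: two correct links, two spurious ones.
predicted = {
    ("UC1", "Login.java"),
    ("UC2", "Cart.java"),
    ("UC2", "Login.java"),
    ("UC3", "Cart.java"),
}

p, r, f = precision_recall_f1(predicted, actual)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

In this toy case two of the four predicted links are correct (precision 0.5) and two of the three true links are recovered (recall about 0.667), which gives an F1 of about 0.571.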

