Can clone detection support quality assessments of requirements specifications?
The question addresses the challenge of quality assurance for software requirements specifications, which are mostly written in natural language and have few automated assessment tools. The contribution is incremental in that it applies an existing technique to a new domain.
The paper tackled the problem of automated quality assessment for software requirements specifications by applying clone detection to identify redundancy from copy&paste operations, finding significant redundancy in 28 real-world specifications totaling 8,667 pages.
Due to their pivotal role in software engineering, considerable effort is spent on the quality assurance of software requirements specifications. As they are mainly described in natural language, relatively few means of automated quality assessment exist. However, we found that clone detection, a technique widely applied to source code, is a promising way to assess one important quality aspect automatically, namely redundancy that stems from copy&paste operations. This paper describes a large-scale case study that applied clone detection to 28 requirements specifications with a total of 8,667 pages. We report on the amount of redundancy found in real-world specifications, discuss its nature and its consequences, and evaluate to what extent existing code clone detection approaches can be applied to assess the quality of requirements specifications in practice.
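The idea of finding copy&paste redundancy in natural-language text can be illustrated with a minimal sketch. It is not the tooling used in the case study: it assumes that a clone is an exact repeat of a word sequence after lowercasing and removing punctuation, and the names `tokenize`, `find_clones`, `clone_coverage` and the `min_length` parameter are hypothetical, chosen only for this illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def find_clones(tokens, min_length=5):
    """Return every window of min_length words that occurs more than once,
    mapped to the list of token positions where it starts."""
    positions = defaultdict(list)
    for i in range(len(tokens) - min_length + 1):
        window = tuple(tokens[i:i + min_length])
        positions[window].append(i)
    return {w: starts for w, starts in positions.items() if len(starts) > 1}


def clone_coverage(tokens, min_length=5):
    """Fraction of tokens that lie inside at least one repeated window
    (an assumed, simplified redundancy measure)."""
    if not tokens:
        return 0.0
    covered = set()
    for starts in find_clones(tokens, min_length).values():
        for start in starts:
            covered.update(range(start, start + min_length))
    return len(covered) / len(tokens)
```

Calling `clone_coverage(tokenize(spec_text))` on the text of a specification gives a rough share of words that belong to duplicated passages. A sliding window of fixed length is the simplest choice here; it reports only verbatim repeats and would miss passages that were edited after being copied.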