SE · Mar 2, 2015

Smelling out Code Clones: Clone Detection Tool Evaluation and Corresponding Challenges

arXiv:1503.00711v1 · 3 citations

Originality: Synthesis-oriented

AI Analysis

This incremental review identifies persistent problems in evaluating software clone detection tools and is relevant to software engineering researchers and practitioners.

The paper addresses the challenge of evaluating clone detection tools in software engineering, highlighting the lack of standard benchmarks and the complexity of comparing tools due to varied parameters and outputs. It reviews existing tools and frameworks, including benchmarks and a method for finding optimal configurations.

Software clones have been an active area of research for the past two decades. However, although numerous clone detection tools are now available, only a small fraction of the literature has focused on tool evaluation, and this is in fact still an open problem. This is mostly due to the fact that standard information retrieval metrics such as recall and precision require a priori knowledge of clones already in the system. Detection tools also typically have a large number of parameters which are difficult to fine-tune for optimal performance on a particular software system, and different outputs produced by different tools add to the complexity of comparing one tool to another. In this review, we further explore the reasons why tool evaluation is still an open challenge, and present the current tools and frameworks targeted at mitigating these problems, focusing on the current standard benchmarks used to evaluate modern clone detection tools, and also presenting a recent method aimed at finding optimal tool configurations.
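To make concrete why recall and precision depend on a priori knowledge of the clones in a system, here is a minimal sketch of how the two metrics could be computed once a reference set is available. It is not the paper's method: the function names, the fragment identifiers, and the exact-match comparison of clone pairs are all assumptions made for illustration. Published benchmarks generally need a looser, overlap-based notion of a match, which this sketch leaves out.

```python
from typing import FrozenSet, Iterable, Set, Tuple

# A clone pair is modelled as an unordered pair of fragment identifiers.
# The identifiers (e.g. "A.java:10-20") are hypothetical placeholders.
ClonePair = FrozenSet[str]


def normalize(pairs: Iterable[Tuple[str, str]]) -> Set[ClonePair]:
    """Turn ordered pairs into unordered ones so (a, b) equals (b, a)."""
    return {frozenset(pair) for pair in pairs}


def precision_recall(
    reported: Iterable[Tuple[str, str]],
    reference: Iterable[Tuple[str, str]],
) -> Tuple[float, float]:
    """Compute precision and recall of reported clone pairs.

    Assumption: a reported pair counts as correct only if it matches a
    reference pair exactly. Without the reference set, neither metric
    can be computed at all.
    """
    reported_set = normalize(reported)
    reference_set = normalize(reference)
    true_positives = len(reported_set & reference_set)

    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(reported_set) if reported_set else 0.0
    recall = true_positives / len(reference_set) if reference_set else 0.0
    return precision, recall


if __name__ == "__main__":
    tool_output = [("f1", "f2"), ("f3", "f4"), ("f5", "f6")]
    known_clones = [("f2", "f1"), ("f3", "f4"), ("f7", "f8"), ("f9", "f10")]
    p, r = precision_recall(tool_output, known_clones)
    print(f"precision={p:.2f} recall={r:.2f}")
```

In this toy input, two of the three reported pairs appear in the four-pair reference set. The `known_clones` list stands in for exactly the information that is usually missing for a real software system, which is the evaluation difficulty the abstract describes.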
