SE · Jun 28, 2020

A Survey on the Evaluation of Clone Detection Performance and Benchmarking

arXiv:2006.15682v1 · 12 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses inconsistent and inadequate evaluation in clone detection research, a concern for software engineering researchers and practitioners. It is incremental in that it synthesizes existing studies rather than introducing new methods.

The paper surveys how clone detection tools are evaluated. It compares the existing benchmarks, ranks tool publications by how well their authors measure recall, precision, execution time, and scalability, and finds that evaluation by authors is generally poor.

A great many clone detection tools have been proposed in the literature. In this paper, we investigate the state of clone detection tool evaluation. We begin by surveying the clone detection benchmarks and performing a multi-faceted evaluation and comparison of their features and capabilities. We then survey the existing clone detection tool and technique publications, and examine how the authors of these works evaluate their own tools and techniques. We rank the individual works by how well they measure recall, precision, execution time, and scalability. We select the works that best evaluate all four metrics as exemplars that future researchers publishing clone detection tools and techniques should consider when designing their own empirical evaluations. We measure statistics on tool evaluation by the authors, and find that the authors' evaluations are poor. We finish our investigation into clone detection evaluation by surveying the existing tool comparison studies, both qualitative and quantitative.

