SE, Dec 12, 2018

Towards Automating Precision Studies of Clone Detectors

Vaibhav Saini, Farima Farmahinifarahani, Yadong Lu, Di Yang, Pedro Martins, Hitesh Sajnani, Pierre Baldi, Cristina Lopes

arXiv:1812.05195v2, 10 citations

Originality: Incremental advance

AI Analysis

This work addresses the labor-intensive and inconsistent process of evaluating precision in software clone research. It is an incremental advance because it builds on an existing need for precision assessment.

The paper tackles the problem of evaluating the precision of clone detection tools. It develops a semi-automated approach that combines automatic classification with manual validation, which significantly reduces the number of clone pairs requiring human validation. It also creates a shared dataset of labeled clone pairs for the research community.

Current research in clone detection suffers from a poor ecosystem for evaluating the precision of clone detection tools. Corpora of labeled clones are scarce and incomplete, making evaluation labor-intensive and idiosyncratic, and limiting inter-tool comparison. Precision-assessment tools are simply lacking. We present a semi-automated approach to facilitate precision studies of clone detection tools. The approach merges automatic mechanisms of clone classification with manual validation of clone pairs. We demonstrate that the proposed automatic approach has very high precision and significantly reduces the number of clone pairs that need human validation during precision experiments. Moreover, we aggregate the individual effort of multiple teams into a single evolving dataset of labeled clone pairs, creating an important asset for software clone research.
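The general idea of mixing automatic classification with manual validation can be illustrated with a minimal sketch. This is not the authors' implementation: the names `ClonePair`, `estimate_precision`, `toy_auto`, and `toy_human` are hypothetical, and the whitespace-normalization rule stands in for whatever automatic mechanisms the paper actually uses. The sketch only shows how pairs resolved automatically never reach a human judge, while precision is still computed over all reported pairs.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass(frozen=True)
class ClonePair:
    """A pair of code fragments reported as clones by some detector (hypothetical type)."""
    left: str
    right: str


def estimate_precision(
    pairs: Sequence[ClonePair],
    auto_classify: Callable[[ClonePair], Optional[bool]],
    ask_human: Callable[[ClonePair], bool],
) -> Tuple[float, int]:
    """Return (precision, number of pairs that needed manual validation).

    auto_classify returns True or False when it can decide, and None
    when the pair must be passed on to a human judge.
    """
    true_clones = 0
    manual = 0
    for pair in pairs:
        verdict = auto_classify(pair)
        if verdict is None:
            # The automatic step could not decide, so fall back to a human.
            verdict = ask_human(pair)
            manual += 1
        if verdict:
            true_clones += 1
    precision = true_clones / len(pairs) if pairs else 0.0
    return precision, manual


def toy_auto(pair: ClonePair) -> Optional[bool]:
    """Toy rule (an assumption): accept pairs identical up to whitespace."""
    if " ".join(pair.left.split()) == " ".join(pair.right.split()):
        return True
    return None  # undecided, defer to manual validation


def toy_human(pair: ClonePair) -> bool:
    """Stand-in for a human judge in this toy example."""
    return pair.left == "x+y" and pair.right == "y+x"
```

In a real precision study the human verdicts would be stored as well, so that later studies can reuse them instead of asking judges again.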

