Towards Informative Tagging of Code Fragments to Support the Investigation of Code Clones
This work addresses a domain-specific problem for software developers and researchers dealing with code clone analysis, but it is incremental as it builds on existing clone detection tools.
The paper tackles the time-consuming task of investigating code clones by proposing a method to cluster the code fragments of a clone class by topic similarity and assign short tags to the resulting clusters, and it reports an experiment on packages of an open source operating system.
Investigating the code fragments reported by code clone detection tools is time-consuming, especially when a large number of reference source files are available. This paper proposes (i) a method that takes a clone class, detected by a code clone detection tool on the basis of syntactic similarity, and clusters its code fragments by topic similarity, treating each fragment as a sequence of words, and (ii) a method for assigning short tags to the resulting clusters. We also report an experiment in which the proposed method is applied to packages of an open source operating system.
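The two steps can be illustrated with a minimal sketch. The abstract does not name the topic model, the word-splitting rules, or the tag-scoring scheme, so everything below is an assumption: bag-of-words cosine similarity stands in for a real topic model, a greedy single-pass grouping stands in for the clustering step, and a smoothed TF-IDF score picks the tags. The function names (`split_words`, `cluster_fragments`, `assign_tags`), the stop-word list, and the threshold are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical stop list: language keywords that carry no topic information.
STOP_WORDS = {"for", "int", "return", "else", "while", "void"}

def split_words(fragment):
    """Turn a code fragment into lowercase words, splitting camelCase and snake_case."""
    result = []
    for token in re.findall(r"[A-Za-z]+", fragment):
        for part in re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", token):
            word = part.lower()
            # Drop very short names (loop counters etc.) and keywords.
            if len(word) >= 3 and word not in STOP_WORDS:
                result.append(word)
    return result

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def cluster_fragments(fragments, threshold=0.3):
    """Greedily group the fragments of one clone class by word similarity.

    Returns a list of (member_indices, summed_word_counts) pairs.
    Assumption: this replaces the topic-similarity clustering of the paper.
    """
    clusters = []
    for index, fragment in enumerate(fragments):
        vector = Counter(split_words(fragment))
        for members, centroid in clusters:
            if cosine(vector, centroid) >= threshold:
                members.append(index)
                centroid.update(vector)
                break
        else:
            clusters.append(([index], Counter(vector)))
    return clusters

def assign_tags(clusters, k=1):
    """Pick the k words per cluster with the highest smoothed TF-IDF score."""
    n = len(clusters)
    doc_freq = Counter(w for _, counts in clusters for w in counts)
    tags = []
    for _, counts in clusters:
        def score(w):
            return counts[w] * (math.log((1 + n) / (1 + doc_freq[w])) + 1)
        ranked = sorted(counts, key=lambda w: (-score(w), w))
        tags.append(ranked[:k])
    return tags

if __name__ == "__main__":
    # One syntactically similar clone class whose fragments cover two topics.
    clone_class = [
        "for (i = 0; i < n; i++) { sum += pixel_width[i]; }",
        "for (j = 0; j < m; j++) { total += pixel_height[j]; }",
        "for (i = 0; i < n; i++) { sum += packet_size[i]; }",
        "for (k = 0; k < n; k++) { bytes += packet_length[k]; }",
    ]
    groups = cluster_fragments(clone_class)
    for (members, _), tag in zip(groups, assign_tags(groups)):
        print(members, tag)
```

The greedy pass depends on fragment order and on the threshold; a real topic model (for example LDA) would need a much larger corpus than a single clone class, which is why this sketch falls back to plain word overlap.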