SE | AI | Nov 28, 2021

Code Clone Detection based on Event Embedding and Event Dependency

arXiv:2111.14183v1 | 7 citations | Has Code
Originality: Incremental advance
AI Analysis

This work addresses a software engineering problem relevant to tasks such as software evolution and reuse, but it is incremental because it builds on existing semantic similarity approaches.

The authors tackled the problem of detecting semantically similar code clones, which traditional syntax-based methods often miss, by proposing the EDAM model that encodes code as interdependent events. Experimental results show that EDAM outperforms state-of-the-art open-source models for code clone detection.

Code clone detection based on semantic similarity has important value in software engineering tasks (e.g., software evolution, software reuse). Traditional code clone detection techniques focus on syntactic similarity and pay less attention to the semantic similarity of code. As a result, candidate code fragments that are semantically similar are missed. To address this issue, we propose a code clone detection method based on semantic similarity. By treating code as a series of interdependent events that occur continuously, we design a model, named EDAM, that encodes code semantic information based on event embedding and event dependency. The EDAM model uses event embedding to model the execution characteristics of program statements and the data dependence information between all statements. In this way, we can embed the program's semantic information into a vector and use that vector to detect semantically similar code. Experimental results show that our EDAM model outperforms state-of-the-art open-source models for code clone detection.
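The idea in the abstract (statements as events, data dependence between them, a vector per program, similarity between vectors) can be illustrated with a toy sketch. This is not the EDAM model: the `Event` structure, the count-based `embed` function, and the cosine comparison are all assumptions chosen for illustration, standing in for the learned event embedding the paper describes.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One program statement viewed as an event (hypothetical structure)."""
    kind: str            # e.g. "add", "return"
    defines: tuple = ()  # variables written by the statement
    uses: tuple = ()     # variables read by the statement


def data_dependencies(events):
    """Return (producer_index, consumer_index) pairs for def-use links."""
    last_def = {}  # variable name -> index of the latest defining event
    edges = []
    for i, ev in enumerate(events):
        for var in ev.uses:
            if var in last_def:
                edges.append((last_def[var], i))
        for var in ev.defines:
            last_def[var] = i
    return edges


def embed(events):
    """Toy embedding (assumption): counts of event kinds and of
    (producer kind, consumer kind) pairs along data dependencies."""
    features = Counter(ev.kind for ev in events)
    for src, dst in data_dependencies(events):
        features[(events[src].kind, events[dst].kind)] += 1
    return features


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# t = x + y; return t
snippet_a = [Event("add", ("t",), ("x", "y")), Event("return", (), ("t",))]
# s = a + b; return s   (same behaviour, different identifiers)
snippet_b = [Event("add", ("s",), ("a", "b")), Event("return", (), ("s",))]
# return x
snippet_c = [Event("return", (), ("x",))]

sim_ab = cosine(embed(snippet_a), embed(snippet_b))
sim_ac = cosine(embed(snippet_a), embed(snippet_c))
print(f"a vs b: {sim_ab:.3f}, a vs c: {sim_ac:.3f}")
```

Because the toy features ignore identifier names and record only event kinds and their dependency links, `snippet_a` and `snippet_b` score as near-identical while `snippet_c` scores lower. A purely textual comparison would treat the renamed variables as differences, which is the gap the abstract attributes to syntax-level methods.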

