SE | Oct 2, 2020

Augmenting Machine Learning with Information Retrieval to Recommend Real Cloned Code Methods for Code Completion

arXiv:2010.00964v1 | 2 citations
Originality: Incremental advance
AI Analysis

The work addresses software developers' need for reliable code reuse, though the advance is incremental because it builds on the authors' existing DeepClone model.

The paper tackles the problem that generated code clones are error-prone and hard to reuse as-is. It applies an information retrieval technique on top of a deep learning model to recommend real clone methods, which the authors report significantly improves recommendation quality.

Software developers frequently reuse source code from repositories because it saves development time and effort. Code clones accumulated in these repositories hence represent often-repeated functionalities and are candidates for reuse in exploratory or rapid development. In previous work, we introduced DeepClone, a deep neural network model trained by fine-tuning the GPT-2 model over the BigCloneBench dataset to predict code clone methods. The probabilistic nature of DeepClone's output generation can lead to syntax and logic errors that require manual editing of the output before final reuse. In this paper, we propose a novel approach of applying an information retrieval (IR) technique on top of DeepClone output to recommend real clone methods closely matching the predicted output. We have quantitatively evaluated our strategy, showing that the proposed approach significantly improves the quality of recommendation.
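
The retrieval step described in the abstract can be illustrated with a minimal sketch. It assumes TF-IDF weighting with cosine similarity as the IR technique, which the abstract does not specify, so this is not necessarily the authors' method. The names `tokenize`, `tfidf_vectors`, `cosine`, and `recommend`, as well as the toy corpus and the predicted snippet, are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(code):
    # Split source text into identifiers, numbers, and single punctuation symbols.
    return re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", code)


def tfidf_vectors(token_lists):
    # Build one sparse TF-IDF vector (a dict) per document, plus the IDF table.
    n = len(token_lists)
    df = Counter()
    for tokens in token_lists:
        df.update(set(tokens))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [
        {t: tf * idf[t] for t, tf in Counter(tokens).items()}
        for tokens in token_lists
    ]
    return vectors, idf


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(predicted, corpus, top_k=1):
    # Rank real methods in `corpus` by similarity to the model's predicted text.
    vectors, idf = tfidf_vectors([tokenize(m) for m in corpus])
    # Tokens unseen in the corpus get weight 0 (an assumption of this sketch).
    query = {t: tf * idf.get(t, 0.0) for t, tf in Counter(tokenize(predicted)).items()}
    scored = [(cosine(query, v), m) for v, m in zip(vectors, corpus)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Hypothetical corpus of real clone methods (stand-ins for a clone repository).
corpus = [
    "public static void copyFile(File src, File dst) throws IOException { Files.copy(src.toPath(), dst.toPath()); }",
    "public static int sum(int[] values) { int total = 0; for (int v : values) total += v; return total; }",
    "public static String readAll(Path p) throws IOException { return new String(Files.readAllBytes(p)); }",
]

# Hypothetical model output with syntax errors (missing comma and closing tokens).
predicted = "public static void copyFile(File src, File dst) throws IOException { Files.copy(src.toPath() dst.toPath())"

best = recommend(predicted, corpus, top_k=1)
```

Because the recommendation is drawn from the corpus rather than from the generated text, the returned method is real code even when the predicted snippet contains errors; the quality of the match then depends on the similarity measure and on how well the corpus covers the predicted functionality.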

