SE · AI · CL · Mar 15, 2022

ReACC: A Retrieval-Augmented Code Completion Framework

arXiv:2203.07722v1 · 694 citations · h-index: 66
Originality: Incremental advance
AI Analysis

ReACC improves code completion for software developers by retrieving external code snippets and feeding them to the completion model, though it is an incremental advance over existing transformer-based methods.

The paper tackles code completion by incorporating external context through retrieval, achieving state-of-the-art performance on the CodeXGLUE benchmark for Python and Java.

Code completion, which aims to predict the following code token(s) according to the code context, can improve the productivity of software development. Recent work has shown that statistical language modeling with transformers can greatly improve performance on the code completion task by learning from large-scale source code datasets. However, current approaches focus only on code context within the file or project, i.e., internal context. Our approach differs by using "external" context, inspired by the human behavior of copying from related code snippets when writing code. Specifically, we propose a retrieval-augmented code completion framework, leveraging both lexical copying and referring to code with similar semantics by retrieval. We adopt a stage-wise training approach that combines a source code retriever and an auto-regressive language model for programming languages. We evaluate our approach on the code completion task in the Python and Java programming languages, achieving state-of-the-art performance on the CodeXGLUE benchmark.
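To make the idea of "external" context concrete, the following is a minimal sketch of the retrieval step only, not the authors' implementation. It assumes a toy lexical retriever based on Jaccard token overlap and assumes that the retrieved snippet is simply prepended to the unfinished code before it would be passed to a language model. The function names (`tokenize`, `jaccard`, `retrieve`, `build_augmented_context`) and the tiny corpus are hypothetical, and the semantic retriever and the language model described in the abstract are omitted.

```python
import re


def tokenize(code: str) -> set[str]:
    """Split code into a set of identifier and number tokens."""
    return set(re.findall(r"[A-Za-z_]\w*|\d+", code))


def jaccard(a: set[str], b: set[str]) -> float:
    """Lexical similarity: shared tokens divided by all distinct tokens."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def retrieve(query: str, corpus: list[str], k: int = 1) -> list[str]:
    """Return the k corpus snippets with the highest token overlap with the query."""
    q = tokenize(query)
    ranked = sorted(corpus, key=lambda s: jaccard(q, tokenize(s)), reverse=True)
    return ranked[:k]


def build_augmented_context(unfinished: str, corpus: list[str], k: int = 1) -> str:
    """Assumed prompt layout: retrieved snippets first, unfinished code last."""
    retrieved = retrieve(unfinished, corpus, k)
    return "\n".join(retrieved + [unfinished])


# Hypothetical external code base and unfinished code, for illustration only.
CORPUS = [
    "def read_json(path):\n    with open(path) as f:\n        return json.load(f)",
    "def add(a, b):\n    return a + b",
]
UNFINISHED = "def load_config(path):\n    with open(path) as f:\n        return "

if __name__ == "__main__":
    # The augmented context is what a completion model would receive as input.
    print(build_augmented_context(UNFINISHED, CORPUS))
```

In this sketch the completion model would see the similar `read_json` snippet ahead of the unfinished function, so tokens such as `json.load(f)` are available to copy. A real system would likely replace the Jaccard score with a stronger lexical or learned retriever.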

Code Implementations: 1 repo
