LG, SE · Jun 18, 2021

Learning to Complete Code with Sketches

arXiv:2106.10158v2 · 44 citations
Originality: Incremental advance
AI Analysis

This paper addresses code completion for developers, improving both completion accuracy and sketch length. The advance is incremental because it builds on existing grammar-guided methods.

The paper tackles code completion by generating completions with "holes" in place of the parts the model is uncertain about. Its model, Grammformer, produces 10-50% more accurate completions than traditional generative models and 37-50% longer sketches than sketch-generating baselines.

Code completion is usually cast as a language modelling problem, i.e., continuing an input in a left-to-right fashion. However, in practice, some parts of the completion (e.g., string literals) may be very hard to predict, whereas subsequent parts directly follow from the context. To handle this, we instead consider the scenario of generating code completions with "holes" inserted in places where a model is uncertain. We develop Grammformer, a Transformer-based model that guides code generation by the programming language grammar, and compare it to a variety of more standard sequence models. We train the models on code completion for C# and Python given partial code context. To evaluate models, we consider both ROUGE and a new metric, RegexAcc, that measures success in generating completions that match long outputs with as few holes as possible. In our experiments, Grammformer generates 10-50% more accurate completions compared to traditional generative models and 37-50% longer sketches compared to sketch-generating baselines trained with similar techniques.
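
To make the idea of scoring a sketch with holes concrete, here is a minimal Python illustration. It is a simplified stand-in inspired by the abstract's description of RegexAcc, not the paper's exact definition: the `<hole>` marker, the whitespace tokenization, and the function names `sketch_to_regex` and `regex_acc_like` are all assumptions made for this example. A sketch scores zero unless it matches the full target, and otherwise earns credit in proportion to how many concrete (non-hole) tokens it commits to.

```python
import re

HOLE = "<hole>"  # hypothetical hole marker, chosen for this illustration


def sketch_to_regex(sketch_tokens):
    """Turn a token-level sketch into a regex; each hole matches one or more tokens."""
    parts = []
    for tok in sketch_tokens:
        if tok == HOLE:
            # A hole stands for a non-empty run of whitespace-separated tokens.
            parts.append(r"(?:\S+(?:\s+\S+)*)")
        else:
            parts.append(re.escape(tok))
    return re.compile(r"\s+".join(parts))


def regex_acc_like(sketch_tokens, target_tokens):
    """Score a sketch: 0 if it does not match the target, else the
    fraction of target tokens that the sketch predicts concretely."""
    if not target_tokens:
        return 0.0
    pattern = sketch_to_regex(sketch_tokens)
    if not pattern.fullmatch(" ".join(target_tokens)):
        return 0.0
    n_concrete = sum(tok != HOLE for tok in sketch_tokens)
    return n_concrete / len(target_tokens)


target = ["print", "(", '"hello"', ")"]
print(regex_acc_like(["print", "(", HOLE, ")"], target))  # hole replaces the string literal
print(regex_acc_like(["print", "(", ")"], target))        # wrong: literal omitted, no hole
```

Under this simplified scoring, a sketch that leaves a hole for the hard-to-predict string literal still earns most of the credit, while a hole-free completion that gets the literal wrong earns none. That is the trade-off the abstract describes: long matching outputs with as few holes as possible.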

