CL · Aug 29, 2018

Mapping Language to Code in Programmatic Context

arXiv:1808.09588v1 · 167 citations
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of code generation in realistic programming environments, but the advance is incremental: it builds on existing encoder-decoder methods and targets a specific domain.

The paper tackles the problem of generating class member functions from English documentation within the programmatic context of the rest of the class, introducing the CONCODE dataset with over 100,000 Java examples and a new encoder-decoder architecture for this task.

Source code is rarely written in isolation. It depends significantly on the programmatic context, such as the class that the code would reside in. To study this phenomenon, we introduce the task of generating class member functions given English documentation and the programmatic context provided by the rest of the class. This task is challenging because the desired code can vary greatly depending on the functionality the class provides (e.g., a sort function may or may not be available when we are asked to "return the smallest element" in a particular member variable list). We introduce CONCODE, a new large dataset with over 100,000 examples consisting of Java classes from online code repositories, and develop a new encoder-decoder architecture that models the interaction between the method documentation and the class environment. We also present a detailed error analysis suggesting that there is significant room for future work on this task.
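The "return the smallest element" example from the abstract can be made concrete with a small sketch. The two classes below are hypothetical illustrations, not drawn from the CONCODE dataset: `SortedScores`, `RawScores`, `sortedCopy`, and `smallest` are invented names. Both classes carry the same documentation string, but a plausible target method differs depending on whether the class environment already provides a sorting helper.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical class whose environment already offers a sorting helper.
class SortedScores {
    private final List<Integer> elements;

    SortedScores(List<Integer> elements) {
        this.elements = new ArrayList<>(elements);
    }

    // Existing member method: part of the programmatic context.
    private List<Integer> sortedCopy() {
        List<Integer> copy = new ArrayList<>(elements);
        Collections.sort(copy);
        return copy;
    }

    /** Return the smallest element. Assumes the list is non-empty. */
    int smallest() {
        // With a sort helper in scope, a plausible target simply reuses it.
        return sortedCopy().get(0);
    }
}

// Hypothetical class with the same member variable but no sorting helper.
class RawScores {
    private final List<Integer> elements;

    RawScores(List<Integer> elements) {
        this.elements = new ArrayList<>(elements);
    }

    /** Return the smallest element. Assumes the list is non-empty. */
    int smallest() {
        // Without a helper, the target has to scan the list directly.
        int min = elements.get(0);
        for (int value : elements) {
            if (value < min) {
                min = value;
            }
        }
        return min;
    }
}

class ContextDemo {
    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(7, 3, 9);
        System.out.println(new SortedScores(data).smallest()); // prints 3
        System.out.println(new RawScores(data).smallest());    // prints 3
    }
}
```

The two `smallest` bodies return the same value yet share almost no tokens, which is one way to see why a model needs the class environment, and not only the documentation, to predict the expected code.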

Code Implementations: 1 repo