LG, SE, ML | Nov 14, 2018

A Grammar-Based Structural CNN Decoder for Code Generation

arXiv:1811.06837v1 | 136 citations | Has Code
Originality: Incremental advance
AI Analysis

The paper addresses the difficulty of generating long token sequences in code generation, though the advance is incremental because it builds on existing neural network methods.

The paper tackles the problem of code generation from program descriptions by proposing a grammar-based structural CNN decoder, which outperforms the previous state-of-the-art method by 5 percentage points on the HearthStone benchmark dataset.

Code generation maps a program description to executable source code in a programming language. Existing approaches mainly rely on a recurrent neural network (RNN) as the decoder. However, we find that a program contains significantly more tokens than a natural language sentence, and thus it may be inappropriate for an RNN to capture such a long sequence. In this paper, we propose a grammar-based structural convolutional neural network (CNN) for code generation. Our model generates a program by predicting the grammar rules of the programming language; we design several CNN modules, including the tree-based convolution and pre-order convolution, whose information is further aggregated by dedicated attentive pooling layers. Experimental results on the HearthStone benchmark dataset show that our CNN code generator significantly outperforms the previous state-of-the-art method by 5 percentage points; additional experiments on several semantic parsing tasks demonstrate the robustness of our model. We also conduct an in-depth ablation test to better understand each component of our model.
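
To make "generating a program by predicting grammar rules" concrete, here is a minimal sketch of the decoding loop only. It is not the paper's model: the toy grammar, the `RULES` table, and the `derive` function are hypothetical names introduced for illustration, and the rule sequence is supplied by hand where a trained network would score every rule at each step. The sketch assumes the leftmost nonterminal is expanded first, which visits the syntax tree in pre-order.

```python
from typing import List, Tuple

# Hypothetical toy grammar (not from the paper). Each rule rewrites one
# nonterminal into a sequence of symbols; any symbol that never appears
# on a left-hand side is treated as a terminal token.
RULES: List[Tuple[str, List[str]]] = [
    ("Expr", ["Expr", "+", "Term"]),  # rule 0
    ("Expr", ["Term"]),               # rule 1
    ("Term", ["x"]),                  # rule 2
    ("Term", ["1"]),                  # rule 3
]
NONTERMINALS = {lhs for lhs, _ in RULES}


def derive(rule_ids: List[int], start: str = "Expr") -> List[str]:
    """Apply each rule to the leftmost unexpanded nonterminal, in order."""
    stack = [start]            # top of the stack is the leftmost symbol
    tokens: List[str] = []     # terminal tokens emitted so far
    pending = list(rule_ids)   # stand-in for the network's predictions
    while stack:
        symbol = stack.pop()
        if symbol not in NONTERMINALS:
            tokens.append(symbol)
            continue
        if not pending:
            raise ValueError("ran out of rules before the derivation finished")
        lhs, rhs = RULES[pending.pop(0)]
        if lhs != symbol:
            raise ValueError(f"rule for {lhs!r} cannot expand {symbol!r}")
        # Push the right-hand side reversed so its first symbol is expanded next.
        stack.extend(reversed(rhs))
    if pending:
        raise ValueError("unused rules remain after the derivation finished")
    return tokens


if __name__ == "__main__":
    print(" ".join(derive([0, 1, 2, 3])))
```

Because every step chooses among rules that are valid for the current nonterminal, the output is syntactically well formed by construction; the decoder predicts one rule per expansion instead of emitting the program token by token.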

Code Implementations: 1 repo