CL · Feb 28, 2022

The impact of lexical and grammatical processing on generating code from natural language

arXiv:2202.13972v2639 citations
AI Analysis

This work addresses natural-language-to-code generation for developers. Its contribution is incremental: it builds on existing architectures such as TranX.

The study investigates how lexical and grammatical processing affect code generation from natural language, using a BERT encoder and a grammar-based decoder, and finds that lexical substitution is a key component of current systems.

Considering the seq2seq architecture of TranX for natural language to code translation, we identify four key components of importance: grammatical constraints, lexical preprocessing, input representations, and copy mechanisms. To study the impact of these components, we use a state-of-the-art architecture that relies on a BERT encoder and a grammar-based decoder, for which a formalization is provided. The paper highlights the importance of the lexical substitution component in current natural-language-to-code systems.
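To make the idea of lexical substitution concrete, here is a minimal sketch in Python. It is an illustration under assumptions, not the preprocessing used in the paper or in TranX: the function names `substitute` and `restore`, the placeholder scheme `str_0`, `str_1`, ..., and the restriction to quoted string literals are all hypothetical choices. The sketch replaces quoted literals in the utterance with placeholders before the text reaches a model, then restores them in the generated code.

```python
import re

# Matches single- or double-quoted literals in the utterance (assumed convention).
QUOTED = re.compile(r"'[^']*'|\"[^\"]*\"")


def substitute(utterance):
    """Replace each quoted literal with a placeholder and record the mapping."""
    mapping = {}

    def repl(match):
        literal = match.group(0)
        # Reuse the placeholder if the same literal appeared earlier.
        for name, value in mapping.items():
            if value == literal:
                return name
        name = f"str_{len(mapping)}"
        mapping[name] = literal
        return name

    return QUOTED.sub(repl, utterance), mapping


def restore(code, mapping):
    """Put the original literals back into the generated code."""
    # Longer placeholder names first, so str_10 is not clobbered by str_1.
    for name in sorted(mapping, key=len, reverse=True):
        code = code.replace(name, mapping[name])
    return code


masked, mapping = substitute("open the file 'data.txt' in mode 'r'")
# A model would see `masked` and emit code over placeholders, for example:
generated = "open(str_0, str_1)"
final_code = restore(generated, mapping)
```

With this scheme the decoder only has to place a placeholder token in the right position rather than reproduce a rare literal character by character, which is one plausible reason such a step matters in practice.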
