Learning Structural Edits via Incremental Tree Transformations
This work addresses the need for more human-like, iterative editing in neural generative models for structured data, particularly programs. Its contribution is incremental, however: it extends prior editing models to tree structures.
The paper tackles structured data generation by modeling incremental tree edits for iterative refinement, focusing on abstract syntax trees of computer programs. It shows that the proposed editor, combined with a novel edit encoder, significantly improves accuracy over single-pass generation methods on two source code edit datasets.
While most neural generative models generate outputs in a single pass, the human creative process is usually one of iterative building and refinement. Recent work has proposed models of editing processes, but these mostly focus on editing sequential data and/or only model a single editing pass. In this paper, we present a generic model for incremental editing of structured data (i.e., "structural edits"). In particular, we focus on tree-structured data, taking abstract syntax trees of computer programs as our canonical example. Our editor learns to iteratively generate tree edits (e.g., deleting or adding a subtree) and applies them to the partially edited data, so that the entire editing process can be formulated as consecutive, incremental tree transformations. To show the unique benefits of modeling tree edits directly, we further propose a novel edit encoder for learning to represent edits, as well as an imitation learning method that allows the editor to be more robust. We evaluate our proposed editor on two source code edit datasets, where results show that, with the proposed edit encoder, our editor significantly improves accuracy over previous approaches that generate the edited program directly in one pass. Finally, we demonstrate that training our editor to imitate experts and correct its mistakes dynamically can further improve its performance.
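To make the idea of "consecutive, incremental tree transformations" concrete, here is a minimal Python sketch of applying a sequence of subtree edits to a toy tree. It is an illustration under stated assumptions, not the paper's implementation: the `Node`, `Delete`, and `Add` classes, the path-based addressing of nodes, and the helper functions are all hypothetical names introduced here, and the learned editor that would predict each edit is replaced by a hard-coded edit list.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union


@dataclass
class Node:
    """A node of a toy tree standing in for an abstract syntax tree."""
    label: str
    children: List["Node"] = field(default_factory=list)


@dataclass
class Delete:
    """Hypothetical edit: remove child `index` of the node reached by `path`."""
    path: Tuple[int, ...]
    index: int


@dataclass
class Add:
    """Hypothetical edit: insert `subtree` as child `index` of the node at `path`."""
    path: Tuple[int, ...]
    index: int
    subtree: Node


Edit = Union[Delete, Add]


def copy_tree(node: Node) -> Node:
    """Deep-copy a tree so that earlier steps of the trajectory stay intact."""
    return Node(node.label, [copy_tree(child) for child in node.children])


def locate(root: Node, path: Tuple[int, ...]) -> Node:
    """Follow a sequence of child indices from the root down to a node."""
    node = root
    for i in path:
        node = node.children[i]
    return node


def apply_edit(root: Node, edit: Edit) -> Node:
    """Return a new tree with a single edit applied; the input is not mutated."""
    new_root = copy_tree(root)
    parent = locate(new_root, edit.path)
    if isinstance(edit, Delete):
        del parent.children[edit.index]
    else:
        parent.children.insert(edit.index, copy_tree(edit.subtree))
    return new_root


def apply_edits(root: Node, edits: List[Edit]) -> List[Node]:
    """Apply edits one at a time, keeping every partially edited tree."""
    trees = [root]
    for edit in edits:
        trees.append(apply_edit(trees[-1], edit))
    return trees


def to_sexpr(node: Node) -> str:
    """Render a tree as an S-expression for inspection."""
    if not node.children:
        return node.label
    return "(" + " ".join([node.label] + [to_sexpr(c) for c in node.children]) + ")"


# Toy example: turn the expression `x + 1` into `x + 2` in two steps.
source = Node("+", [Node("x"), Node("1")])
edits = [
    Delete(path=(), index=1),                    # drop the literal `1`
    Add(path=(), index=1, subtree=Node("2")),    # add the literal `2`
]
trajectory = apply_edits(source, edits)
for tree in trajectory:
    print(to_sexpr(tree))
# prints:
# (+ x 1)
# (+ x)
# (+ x 2)
```

Each intermediate tree in `trajectory` is the "partially edited data" that the next edit is applied to. In a learned editor, the edit list would presumably be produced step by step, conditioned on the current tree, rather than supplied up front as it is here.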