Exploring Dynamic Selection of Branch Expansion Orders for Code Generation
This work addresses a specific bottleneck in code generation, the fixed pre-order expansion of AST branches, and offers an incremental improvement over existing Seq2Tree models.
The paper tackles the suboptimal pre-order traversal of multi-branch nodes in Seq2Tree code generation models. It introduces a context-based Branch Selector that dynamically determines the expansion order of branches. The selector is trained with reinforcement learning, using a reward based on the difference between model losses under different expansion orders. Experiments on commonly used datasets show that the approach is effective and general.
Due to its great potential in facilitating software development, code generation has attracted increasing attention recently. Generally, dominant models are Seq2Tree models, which convert the input natural language description into a sequence of tree-construction actions corresponding to the pre-order traversal of an Abstract Syntax Tree (AST). However, such a traversal order may not be suitable for handling all multi-branch nodes. In this paper, we propose to equip the Seq2Tree model with a context-based Branch Selector, which is able to dynamically determine optimal expansion orders of branches for multi-branch nodes. In particular, since the selection of expansion orders is a non-differentiable multi-step operation, we optimize the selector through reinforcement learning, and formulate the reward function as the difference of model losses obtained through different expansion orders. Experimental results and in-depth analysis on several commonly used datasets demonstrate the effectiveness and generality of our approach. We have released our code at https://github.com/DeepLearnXMU/CG-RL.
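To make the training signal described above more concrete, the following is a minimal sketch of a REINFORCE-style update for choosing the expansion order of a single multi-branch node. It is not the released CG-RL implementation. Several simplifications are assumptions made for illustration: the selector here is a context-free softmax over all permutations of the branches (the paper's selector is context-based), the baseline is taken to be the default left-to-right order, and `train_selector`, `TOY_LOSS`, and the branch names `test` and `body` are hypothetical. The toy loss table stands in for the Seq2Tree model loss under each order.

```python
import math
import random
from itertools import permutations


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_selector(branches, loss_fn, steps=500, lr=0.5, seed=0):
    """REINFORCE over candidate expansion orders of one multi-branch node.

    loss_fn maps an order (tuple of branch names) to a scalar loss.
    Assumption: it stands in for the generation model's loss.
    """
    rng = random.Random(seed)
    orders = list(permutations(branches))
    logits = [0.0] * len(orders)  # one logit per candidate order
    # Loss under the default left-to-right (pre-order) expansion.
    default_loss = loss_fn(tuple(branches))

    for _ in range(steps):
        probs = softmax(logits)
        # Sample an order from the current policy (non-differentiable step).
        k = rng.choices(range(len(orders)), weights=probs)[0]
        # Reward: how much lower the loss is than under the default order.
        reward = default_loss - loss_fn(orders[k])
        # d log pi(k) / d logit_i = 1[i == k] - probs[i]
        for i in range(len(logits)):
            grad = (1.0 if i == k else 0.0) - probs[i]
            logits[i] += lr * reward * grad

    return dict(zip(orders, softmax(logits)))


# Hypothetical losses: expanding "body" before "test" is assumed cheaper.
TOY_LOSS = {("test", "body"): 1.0, ("body", "test"): 0.4}

policy = train_selector(["test", "body"], TOY_LOSS.__getitem__)
best = max(policy, key=policy.get)
print(best, round(policy[best], 3))
```

Because the reward is zero whenever the default order is sampled, the policy only moves when an alternative order is tried, and it shifts probability toward that order in proportion to the loss reduction. In a real system the loss would come from the generation model and the logits from a network conditioned on the decoding context.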