SELGAug 18, 2022

Learning Program Representations with a Tree-Structured Transformer

Peking U
arXiv:2208.08643v212 citationsh-index: 28
AI Analysis

This addresses a critical bottleneck in applying deep learning to program analysis, offering a novel solution for tasks like bug localization, though it is incremental in advancing tree-structured neural networks.

The paper tackles the problem of learning vector representations for programs to improve program understanding tasks, proposing Tree-Transformer, which significantly outperforms existing tree-based and graph-based methods in tree-level and node-level prediction tasks.

Learning vector representations for programs is a critical step in applying deep learning techniques for program understanding tasks. Various neural network models are proposed to learn from tree-structured program representations, e.g., abstract syntax tree (AST) and concrete syntax tree (CST). However, most neural architectures either fail to capture long-range dependencies which are ubiquitous in programs, or cannot learn effective representations for syntax tree nodes, making them incapable of performing the node-level prediction tasks, e.g., bug localization. In this paper, we propose Tree-Transformer, a novel recursive tree-structured neural network to learn the vector representations for source codes. We propose a multi-head attention mechanism to model the dependency between siblings and parent-children node pairs. Moreover, we propose a bi-directional propagation strategy to allow node information passing in two directions, bottom-up and top-down along trees. In this way, Tree-Transformer can learn the information of the node features as well as the global contextual information. The extensive experimental results show that our Tree-Transformer significantly outperforms the existing tree-based and graph-based program representation learning approaches in both the tree-level and node-level prediction tasks.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes