CSA-Trans: Code Structure Aware Transformer for AST
This work addresses the challenge of efficiently processing source code for tasks such as code summarization, improving speed and memory usage over prior AST-based Transformer models used by developers and AI tools.
The paper improves Transformer models for source code by designing a better self-attention mechanism for Abstract Syntax Trees (ASTs). The resulting model, CSA-Trans, outperforms 14 baselines on code summarization for Python and Java. On the Java dataset it is 41.92% faster than AST-Trans and 25.31% more memory efficient than SG-Trans.
When applying the Transformer architecture to source code, designing a good self-attention mechanism is critical because it determines how node relationships are extracted from the Abstract Syntax Trees (ASTs) of the source code. We present Code Structure Aware Transformer (CSA-Trans), which uses a Code Structure Embedder (CSE) to generate a node-specific Positional Encoding (PE) for each node in the AST. The CSE generates these node PEs using disentangled attention. To further extend the self-attention capability, we adopt Stochastic Block Model (SBM) attention. Our evaluation shows that our PE captures the relationships between AST nodes better than other graph-related PE techniques. We also show through quantitative and qualitative analysis that SBM attention generates more node-specific attention coefficients. We demonstrate that CSA-Trans outperforms 14 baselines in code summarization tasks for both Python and Java, while being 41.92% faster than AST-Trans and 25.31% more memory efficient than SG-Trans on the Java dataset.
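To make the input to an AST-based self-attention concrete, the following is a minimal sketch of how a Python snippet can be flattened into a sequence of AST nodes plus parent-child edges, using only the standard-library `ast` module. It is not the CSA-Trans preprocessing pipeline; the function name `ast_nodes_and_edges` and the pre-order node numbering are assumptions made for illustration.

```python
import ast


def ast_nodes_and_edges(source):
    """Flatten the AST of `source` into node types and parent-child edges.

    Hypothetical helper for illustration only.
    node_types[i] is the type name of node i in pre-order;
    edges is a list of (parent_index, child_index) pairs.
    """
    tree = ast.parse(source)
    node_types = []
    edges = []

    def visit(node, parent_index):
        # Assign the next free index to this node (pre-order numbering).
        index = len(node_types)
        node_types.append(type(node).__name__)
        if parent_index is not None:
            edges.append((parent_index, index))
        # Recurse into the direct children of this node.
        for child in ast.iter_child_nodes(node):
            visit(child, index)

    visit(tree, None)
    return node_types, edges


node_types, edges = ast_nodes_and_edges("def add(a, b):\n    return a + b\n")
print(node_types[:2])  # the root node type followed by its first child
print(len(node_types), "nodes,", len(edges), "edges")
```

A model of this kind would attend over the entries of `node_types`, with the edges (or encodings derived from them) supplying the structural signal that a positional encoding is meant to capture. Because the result is a tree, there is exactly one edge fewer than there are nodes.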