SE, AI, PL
May 31, 2022

HierarchyNet: Learning to Summarize Source Code with Heterogeneous Representations

arXiv:2205.15479v3
107 citations
h-index: 17
Originality: Incremental advance
AI Analysis

The paper addresses the problem of generating summaries for source code, which helps developers understand and maintain software. The contribution appears incremental: it builds on existing techniques and adds specific architectural improvements.

The paper tackles code summarization by proposing HierarchyNet, which operates on heterogeneous code representations. Its architecture combines graph transformers, tree-based CNNs, and transformers to capture code features at multiple levels, and the authors report results that surpass methods such as PA-Former, CAST, and NeuralCodeSum.

We propose a novel method for code summarization utilizing Heterogeneous Code Representations (HCRs) and our specially designed HierarchyNet. HCRs effectively capture essential code features at lexical, syntactic, and semantic levels by abstracting coarse-grained code elements and incorporating fine-grained program elements in a hierarchical structure. Our HierarchyNet method processes each layer of the HCR separately through a unique combination of the Heterogeneous Graph Transformer, a Tree-based CNN, and a Transformer Encoder. This approach preserves dependencies between code elements and captures relations through a novel Hierarchical-Aware Cross Attention layer. Our method surpasses current state-of-the-art techniques, such as PA-Former, CAST, and NeuralCodeSum.
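The abstract does not spell out how the Hierarchical-Aware Cross Attention layer is defined, so the following is only a minimal sketch of plain scaled dot-product cross attention, the generic mechanism such a layer presumably builds on. It shows how embeddings from one level of a hierarchy could attend over embeddings from another level. The names `cross_attention`, `coarse`, and `fine` are illustrative assumptions, not identifiers from the paper, and no learned projections or hierarchy-specific terms are included.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Generic scaled dot-product cross attention (assumption: not the paper's layer).

    queries: list of query vectors, e.g. coarse-grained element embeddings
    keys, values: lists of vectors, e.g. fine-grained element embeddings
    Returns one output vector per query, each a weighted average of `values`.
    """
    dim = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dim).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors, one output dimension at a time.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


# Hypothetical toy embeddings: two coarse-grained and three fine-grained elements.
coarse = [[1.0, 0.0], [0.0, 1.0]]
fine = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = cross_attention(coarse, fine, fine)
```

Each row of `attended` is a convex combination of the `fine` vectors, so a coarse-grained element ends up with a summary of the fine-grained elements most similar to it. A real model would add learned query, key, and value projections and multiple heads.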

