SE | LG | Mar 31, 2021

HAConvGNN: Hierarchical Attention Based Convolutional Graph Neural Network for Code Documentation Generation in Jupyter Notebooks

arXiv:2104.01002v2665 citations
AI Analysis

This paper addresses code documentation generation for data scientists who work in computational notebooks. The contribution is incremental: it builds on existing CDG tasks and extends them with a focus on multi-cell structure.

The paper tackles the problem of generating documentation for multiple code cells in Jupyter notebooks, proposing HAConvGNN with hierarchical attention, and shows it outperforms baselines on a new Kaggle corpus.

Jupyter notebooks allow data scientists to write machine learning code together with its documentation in cells. In this paper, we propose a new task of code documentation generation (CDG) for computational notebooks. Previous CDG tasks focus on generating documentation for single code snippets. In a computational notebook, by contrast, a single piece of documentation in a markdown cell often corresponds to multiple code cells, and these code cells have an inherent structure. We propose a new model (HAConvGNN) that uses a hierarchical attention mechanism to consider both the relevant code cells and the relevant code tokens when generating the documentation. Tested on a new corpus constructed from well-documented Kaggle notebooks, our model outperforms the baseline models.
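To make the idea of hierarchical attention over code cells and code tokens concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: HAConvGNN also involves convolutional graph neural network encoders, which are omitted here. The dot-product scoring, the function name `hierarchical_attention`, the vector dimension, and the random toy embeddings are all assumptions made for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def weighted_sum(weights, vectors):
    """Combine equally sized vectors using the given weights."""
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]


def hierarchical_attention(query, cells):
    """Two-level attention (illustrative assumption, not the paper's exact model).

    query: decoder state vector
    cells: list of code cells, each a list of token embedding vectors
    """
    token_weights = []
    cell_vectors = []
    for tokens in cells:
        # Low level: attend over the tokens inside one code cell.
        w = softmax([dot(t, query) for t in tokens])
        token_weights.append(w)
        cell_vectors.append(weighted_sum(w, tokens))
    # High level: attend over the per-cell summary vectors.
    cell_weights = softmax([dot(c, query) for c in cell_vectors])
    context = weighted_sum(cell_weights, cell_vectors)
    return context, cell_weights, token_weights


# Toy input: three code cells with 5, 3, and 7 tokens (random embeddings).
random.seed(0)
DIM = 4
cells = [[[random.gauss(0, 1) for _ in range(DIM)] for _ in range(n)]
         for n in (5, 3, 7)]
query = [random.gauss(0, 1) for _ in range(DIM)]

context, cell_weights, token_weights = hierarchical_attention(query, cells)
```

In this sketch, `cell_weights` indicates how much each code cell contributes to the context vector at one decoding step, and each entry of `token_weights` indicates which tokens matter within that cell. A trained model would learn the scoring function rather than use raw dot products.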

Code Implementations: 2 repos