CL · Jul 31, 2017

The Code2Text Challenge: Text Generation in Source Code Libraries

arXiv:1708.00098v1 · 5 citations
Originality: Synthesis-oriented
AI Analysis

The paper proposes a benchmark for text generation in software engineering. The contribution is incremental because it builds on existing data resources.

The paper introduces a new shared task for generating function descriptions from source code libraries, using existing datasets from semantic parser induction studies across multiple natural and programming languages.

We propose a new shared task for tactical data-to-text generation in the domain of source code libraries. Specifically, we focus on text generation of function descriptions from example software projects. Data is drawn from existing resources used for studying the related problem of semantic parser induction (Richardson and Kuhn, 2017b; Richardson and Kuhn, 2017a), and spans a wide variety of both natural languages and programming languages. In this paper, we describe these existing resources, which will serve as training and development data for the task, and discuss plans for building new independent test sets.

Code Implementations: 1 repo
