The Code2Text Challenge: Text Generation in Source Code Libraries
The paper proposes a benchmark for text generation in software engineering; the contribution is incremental, since it builds on prior data resources.
The paper introduces a new shared task for generating function descriptions from source code libraries, using existing datasets from semantic parser induction studies across multiple natural and programming languages.
We propose a new shared task for tactical data-to-text generation in the domain of source code libraries. Specifically, we focus on text generation of function descriptions from example software projects. Data is drawn from existing resources used for studying the related problem of semantic parser induction (Richardson and Kuhn, 2017b; Richardson and Kuhn, 2017a), and spans a wide variety of both natural languages and programming languages. In this paper, we describe these existing resources, which will serve as training and development data for the task, and discuss plans for building new independent test sets.
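To make the task setup concrete, the following is a minimal sketch of how a single function/description pair could be represented and turned into a source/target pair for a generation model. The class name `FunctionExample`, its fields, the helper `to_generation_pair`, and the sample values are all hypothetical illustrations and are not taken from the released datasets or their file format.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FunctionExample:
    """One hypothetical pair: a function signature and its description."""
    signature: str             # formal representation given to the generator
    description: str           # reference text the generator should produce
    programming_language: str  # e.g. "java" (assumed label, not a dataset field)
    natural_language: str      # e.g. "en" (assumed label, not a dataset field)


def to_generation_pair(example: FunctionExample) -> Tuple[str, str]:
    """Return (source, target) strings for a text generation model."""
    # Prefix the signature with its programming language so that a single
    # model could, in principle, be trained across several libraries.
    source = f"{example.programming_language}: {example.signature}"
    return source, example.description


# Illustrative values only; not an actual dataset entry.
example = FunctionExample(
    signature="long max(long a, long b)",
    description="Returns the greater of two long values.",
    programming_language="java",
    natural_language="en",
)
source, target = to_generation_pair(example)
```

In this sketch the signature is the input and the description is the output, which reverses the direction of semantic parser induction, where the text is mapped to the function representation.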