LG AI PL ML

Jul 6, 2018

NAPS: Natural Program Synthesis Dataset

Maksym Zavershynskyi, Alex Skidanov, Illia Polosukhin

arXiv:1807.03168v1

12.537 citations

AI Analysis

The paper provides a new benchmark for program synthesis researchers working with realistic data. Its contribution is incremental, since it focuses on dataset creation rather than methodological advancement.

The authors introduced NAPS, a program synthesis dataset with human-written problem statements and solutions from programming competitions, to enable work with real user-generated data. Their best baseline model achieved only 8.8% accuracy, highlighting the dataset's complexity and potential for future research.

We present a program synthesis-oriented dataset consisting of human-written problem statements and solutions to these problems. The problem statements were collected via crowdsourcing, and the program solutions were extracted from human-written solutions in programming competitions, accompanied by input/output examples. We propose using this dataset for program synthesis tasks aimed at working with real user-generated data. As a baseline we present a few models, with the best model achieving 8.8% accuracy, showcasing both the complexity of the dataset and ample room for future research.
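The abstract does not say how accuracy is computed. As a rough illustration only, the sketch below assumes that a problem counts as solved when a candidate program reproduces every input/output example attached to it, and that accuracy is the fraction of problems solved. The names `passes_all` and `accuracy`, and the representation of a problem as a list of (input, expected output) pairs, are hypothetical and not taken from the paper.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical representation: one problem = a list of (input, expected_output) pairs.
IOExample = Tuple[object, object]


def passes_all(program: Callable[[object], object],
               examples: Sequence[IOExample]) -> bool:
    """Return True if the program reproduces every expected output."""
    for given, expected in examples:
        try:
            if program(given) != expected:
                return False
        except Exception:
            # Assumption: a candidate that crashes counts as a failure.
            return False
    return True


def accuracy(programs: Sequence[Callable[[object], object]],
             problems: Sequence[Sequence[IOExample]]) -> float:
    """Fraction of problems whose candidate program passes all of its examples."""
    if not problems:
        return 0.0
    solved = sum(passes_all(p, ex) for p, ex in zip(programs, problems))
    return solved / len(problems)


# Toy data, invented for illustration (not drawn from NAPS).
toy_problems: List[List[IOExample]] = [
    [(2, 4), (3, 9)],   # square a number
    [([1, 2, 3], 6)],   # sum a list
]
toy_programs = [
    lambda x: x * x,    # correct for the first problem
    lambda xs: max(xs), # wrong for the second problem
]
print(accuracy(toy_programs, toy_problems))
```

Under this assumed definition, a program that is correct on most but not all examples still counts as unsolved, which is one reason such a metric can stay low on realistic data.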
