CL · LG · Mar 3, 2019

Predicting Algorithm Classes for Programming Word Problems

Vinayak Athavale, Aayush Naik, Rajas Vanjape, Manish Shrivastava

arXiv:1903.00830v2 30.1995 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the problem of automating algorithm classification for educational or programming assistance tools, but it is incremental as it applies existing text classification methods to a new domain.

The paper tackles the task of predicting algorithm classes for programming word problems by introducing new datasets and posing it as a text classification problem, achieving 62.7% accuracy on a multiclass dataset and showing the best classifier is only 9% less accurate than humans.

We introduce the task of algorithm class prediction for programming word problems. A programming word problem is a problem written in natural language that can be solved using an algorithm or a program. We define classes of programming word problems that correspond to the class of algorithms required to solve each problem. We present four new datasets for this task: two multiclass datasets with 550 and 1159 problems, respectively, and two multilabel datasets with 3737 and 3960 problems, respectively. We pose the problem as a text classification problem and train both neural network and non-neural network models on this task. Our best-performing classifier achieves an accuracy of 62.7 percent in the multiclass case on the five-class classification dataset, Codeforces Multiclass-5 (CFMC5). We also conduct a human-level analysis and compare human performance with that of our text classification models. Our best classifier has an accuracy only 9 percent lower than that of a human on this task. To the best of our knowledge, these are the first reported results on such a task. We make our code and datasets publicly available.
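To make the "text classification" framing concrete, the following is a minimal sketch of a non-neural baseline: a bag-of-words multinomial Naive Bayes classifier with add-one smoothing. It is not the authors' model. The abstract does not name the specific classifiers used, and the problem statements and class labels below ("graph", "math", "dp") are hypothetical examples invented for illustration, not drawn from the paper's datasets.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase, split on whitespace, keep purely alphabetic tokens."""
    return [w for w in text.lower().split() if w.isalpha()]


class NaiveBayes:
    """Multinomial Naive Bayes over word counts with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # word counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        # Words never seen in training carry no information; skip them.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for t in tokens:
                score += math.log((counts[t] + 1) / denom)  # log likelihood
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(model, texts, labels):
    """Fraction of texts whose predicted class matches the gold label."""
    correct = sum(model.predict(t) == y for t, y in zip(texts, labels))
    return correct / len(labels)


# Hypothetical toy training set (not from the paper's datasets).
train = [
    ("find the shortest path between two nodes in a weighted graph", "graph"),
    ("count connected components in an undirected graph with n nodes and m edges", "graph"),
    ("compute the greatest common divisor of two integers", "math"),
    ("find the number of primes less than n modulo a large prime", "math"),
    ("find the maximum sum of a subsequence with no two adjacent elements", "dp"),
    ("count the ways to climb n stairs taking one or two steps", "dp"),
]
clf = NaiveBayes().fit([t for t, _ in train], [y for _, y in train])

print(clf.predict("is there a path between nodes u and v in the graph"))
print(clf.predict("compute the greatest common divisor of three integers"))
```

The `accuracy` helper computes the multiclass metric the abstract reports (fraction of problems assigned the correct single class). A multilabel setting would instead need one binary decision per class and a different metric.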
