CL · Sep 24, 2020

Ape210K: A Large-Scale and Template-Rich Dataset of Math Word Problems

arXiv:2009.11506v2 · 93 citations · Has Code
Originality: Synthesis-oriented
AI Analysis

Ape210K gives researchers in math word problem solving a more challenging benchmark, though the contribution is incremental: it builds on existing datasets and methods.

The authors address the limited scale and diversity of existing datasets for automatic math word problem solving by releasing Ape210K, a large-scale, template-rich dataset of 210K Chinese elementary school-level problems. It is 9 times the size of Math23K, the previous largest public dataset, and its 56K templates are 25 times more than Math23K's. They also propose a copy-augmented, feature-enriched seq2seq model that outperforms existing models by 3.2% on Math23K.

Automatic math word problem solving has attracted growing attention in recent years. The evaluation datasets used by previous works have serious limitations in terms of scale and diversity. In this paper, we release a new large-scale and template-rich math word problem dataset named Ape210K. It consists of 210K Chinese elementary school-level math problems, which is 9 times the size of the largest public dataset, Math23K. Each problem contains both the gold answer and the equations needed to derive the answer. Ape210K is also of greater diversity with 56K templates, which is 25 times more than Math23K. Our analysis shows that solving Ape210K requires not only natural language understanding but also commonsense knowledge. We expect Ape210K to be a benchmark for math word problem solving systems. Experiments indicate that state-of-the-art models on the Math23K dataset perform poorly on Ape210K. We propose a copy-augmented and feature-enriched sequence-to-sequence (seq2seq) model, which outperforms existing models by 3.2% on the Math23K dataset and serves as a strong baseline for the Ape210K dataset. The gap between human performance and our baseline model is still significant, calling for further research efforts. We make the Ape210K dataset publicly available at https://github.com/yuantiku/ape210k.
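The abstract measures diversity by counting equation templates. This page does not give the paper's exact template definition, so the following is only a minimal sketch under a common assumption: a template is an equation with each distinct numeric literal replaced by a placeholder. The names `equation_template` and `count_templates` are hypothetical and are not taken from the Ape210K repository.

```python
import re

# Matches integer or decimal literals such as 3 or 12.5 (assumed number format).
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def equation_template(equation: str) -> str:
    """Replace each distinct number with n0, n1, ... in order of first appearance.

    Assumption: this is an illustrative template definition, not necessarily
    the one used to arrive at the 56K figure in the abstract.
    """
    slots = {}

    def placeholder(match):
        token = match.group(0)
        if token not in slots:
            slots[token] = f"n{len(slots)}"
        return slots[token]

    # Strip spaces so formatting differences do not create distinct templates.
    return NUMBER.sub(placeholder, equation.replace(" ", ""))


def count_templates(equations):
    """Count the distinct templates across a collection of equations."""
    return len({equation_template(e) for e in equations})
```

Under this definition, `x=3*5` and `x=2*7` share the template `x=n0*n1`, while `x=3+5` maps to a different one, so a dataset with more templates covers more distinct equation structures.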

Code Implementations: 1 repo
