CL AI LG Jun 29, 2022

Solving Quantitative Reasoning Problems with Language Models

Aitor Lewkowycz, Anders Andreassen, David Dohan, Ethan Dyer, Henryk Michalewski, Vinay Ramasesh, Ambrose Slone, Cem Anil, Imanol Schlag, Theo Gutman-Solo, Yuhuai Wu, Behnam Neyshabur

DeepMind

arXiv:2206.14858v2 40.8 1814 citations h-index: 44

Originality: Highly original

AI Analysis

This work addresses a key limitation of language models in applications that require quantitative reasoning, such as education and science, though it is incremental in that it builds on existing pretraining methods.

The paper addresses the difficulty language models have with quantitative reasoning tasks such as college-level math and science. It introduces Minerva, a model that achieves state-of-the-art performance on technical benchmarks and correctly answers nearly a third of over two hundred undergraduate-level problems.

Language models have achieved remarkable performance on a wide range of tasks that require natural language understanding. Nevertheless, state-of-the-art models have generally struggled with tasks that require quantitative reasoning, such as solving mathematics, science, and engineering problems at the college level. To help close this gap, we introduce Minerva, a large language model pretrained on general natural language data and further trained on technical content. The model achieves state-of-the-art performance on technical benchmarks without the use of external tools. We also evaluate our model on over two hundred undergraduate-level problems in physics, biology, chemistry, economics, and other sciences that require quantitative reasoning, and find that the model can correctly answer nearly a third of them.
