LG, NE · Nov 30, 2021

Show Your Work: Scratchpads for Intermediate Computation with Language Models

arXiv:2112.00114v1 · 1033 citations
Originality: Highly original
AI Analysis

The paper addresses a key bottleneck for language models on tasks that require sequential reasoning, and offers a practical way to improve model capabilities in computation-heavy domains.

The paper tackles the problem that language models struggle with unbounded multi-step computations, such as adding integers or executing programs. It introduces a 'scratchpad' method in which the model writes out intermediate steps before the final answer; this dramatically improves performance on tasks ranging from long addition to program execution, even in few-shot settings.

Large pre-trained language models perform remarkably well on tasks that can be done "in one pass", such as generating realistic text or synthesizing computer programs. However, they struggle with tasks that require unbounded multi-step computation, such as adding integers or executing programs. Surprisingly, we find that these same models are able to perform complex multi-step computations -- even in the few-shot regime -- when asked to perform the operation "step by step", showing the results of intermediate computations. In particular, we train transformers to perform multi-step computations by asking them to emit intermediate computation steps into a "scratchpad". On a series of increasingly complex tasks ranging from long addition to the execution of arbitrary programs, we show that scratchpads dramatically improve the ability of language models to perform multi-step computations.
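To make the scratchpad idea concrete, the following minimal sketch builds a training target for long addition in which every column sum and carry is written out before the final answer. The function name `addition_scratchpad`, the `<scratch>` delimiters, and the exact line format are illustrative assumptions, not necessarily the format used in the paper.

```python
def addition_scratchpad(a: int, b: int) -> str:
    """Return a text target that shows each column of a + b, then the answer.

    Hypothetical format for illustration only; the paper's own scratchpad
    layout may differ.
    """
    if a < 0 or b < 0:
        raise ValueError("this sketch handles non-negative integers only")

    # Digits in least-significant-first order, as in schoolbook addition.
    xs = [int(d) for d in reversed(str(a))]
    ys = [int(d) for d in reversed(str(b))]

    lines = ["<scratch>"]
    carry = 0
    digits = []
    for i in range(max(len(xs), len(ys))):
        x = xs[i] if i < len(xs) else 0
        y = ys[i] if i < len(ys) else 0
        total = x + y + carry
        digit, new_carry = total % 10, total // 10
        # One scratchpad line per column: the intermediate state is explicit.
        lines.append(
            f"{x} + {y} + carry {carry} = {total} -> digit {digit}, carry {new_carry}"
        )
        digits.append(digit)
        carry = new_carry
    if carry:
        digits.append(carry)

    lines.append("</scratch>")
    # The final answer follows the scratchpad, most significant digit first.
    lines.append("".join(str(d) for d in reversed(digits)))
    return "\n".join(lines)
```

Calling `addition_scratchpad(29, 57)` yields two intermediate lines (one per column) between the delimiters, followed by `86` on the last line. A model trained on such targets predicts the intermediate lines first, so each step only requires a bounded amount of computation.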

