CL Oct 8, 2021

A Few More Examples May Be Worth Billions of Parameters

Yuval Kirstain, Patrick Lewis, Sebastian Riedel, Omer Levy

arXiv:2110.04374v123.1305 citations Has Code

Originality: Incremental advance

AI Analysis

This work offers machine learning researchers and practitioners insight into allocating resources efficiently when training models, although the advance is incremental because it builds on existing scaling literature.

The study examines the trade-off between increasing the number of model parameters and the number of labeled examples across tasks. Scaling parameters consistently improves performance, whereas additional examples help only in tasks with a restricted output space, such as classification, extractive QA, and multiple choice, where a few hundred examples can be worth billions of parameters. In open QA, enlarging the training set does not improve performance.

We investigate the dynamics of increasing the number of model parameters versus the number of labeled examples across a wide variety of tasks. Our exploration reveals that while scaling parameters consistently yields performance improvements, the contribution of additional examples highly depends on the task's format. Specifically, in open question answering tasks, enlarging the training set does not improve performance. In contrast, classification, extractive question answering, and multiple choice tasks benefit so much from additional examples that collecting a few hundred examples is often "worth" billions of parameters. We hypothesize that unlike open question answering, which involves recalling specific information, solving strategies for tasks with a more restricted output space transfer across examples, and can therefore be learned with small amounts of labeled data.
