CL · Aug 25, 2022

Training a T5 Using Lab-sized Resources

arXiv:2208.12097v1 · 8 citations · h-index: 34
Originality: Synthesis-oriented
AI Analysis

This work lowers the barrier to entry for researchers with limited resources, making model development more accessible. Its contribution is incremental, however: it applies known techniques to a new language.

The paper tackles the high resource and time requirements of training large language models. It presents techniques that make training feasible with modest lab resources in a reasonable amount of time, illustrated by training the first T5 model for Danish.

Training large neural language models on large datasets is resource- and time-intensive. These requirements create a barrier to entry, where those with fewer resources cannot build competitive models. This paper presents various techniques for making it possible to (a) train a large language model using resources that a modest research lab might have, and (b) train it in a reasonable amount of time. We provide concrete recommendations for practitioners, which we illustrate with a case study: a T5 model for Danish, the first for this language.

