CL · Sep 15, 2023

Vocabulary-level Memory Efficiency for Language Model Fine-tuning

arXiv:2309.08708v2 · 12 citations · h-index: 29
Originality: Incremental advance
AI Analysis

This work addresses the memory constraints that researchers and practitioners face when fine-tuning large language models. It is an incremental advance: it builds on prior work on memory-efficient fine-tuning, which focused on reducing the number of trainable parameters.

The paper tackles the memory inefficiency of language model fine-tuning by showing that a large portion of the vocabulary goes unused during fine-tuning. It then proposes a method that exploits this finding to shrink the memory footprint of the embedding matrix, reporting substantial memory reductions across a range of models and tasks without affecting downstream task performance.

The extensive memory footprint of language model (LM) fine-tuning poses a challenge for both researchers and practitioners. LMs use an embedding matrix to represent extensive vocabularies, forming a substantial proportion of the model parameters. While previous work towards memory-efficient fine-tuning has focused on minimizing the number of trainable parameters, reducing the memory footprint of the embedding matrix has yet to be explored. We first demonstrate that a significant proportion of the vocabulary remains unused during fine-tuning. We then propose a simple yet effective approach that leverages this finding to minimize memory usage. We show that our approach provides substantial reductions in memory usage across a wide range of models and tasks. Notably, our approach does not impact downstream task performance, while allowing more efficient use of computational resources.
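
The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it only assumes that tokens absent from the fine-tuning data can be dropped from the embedding matrix and the remaining token ids renumbered. The function names (`build_reduced_vocab`, `prune_embedding`, `remap`), the `special_ids` parameter, and the toy data are all hypothetical, and the embedding matrix is modelled as a plain list of rows rather than a framework tensor.

```python
def build_reduced_vocab(token_id_sequences, special_ids=()):
    """Collect the token ids that occur in the fine-tuning data.

    special_ids (an assumption of this sketch) are ids such as padding
    or unknown tokens that are kept even if they never appear.
    Returns the sorted kept ids and a mapping from old id to new id.
    """
    used = set(special_ids)
    for seq in token_id_sequences:
        used.update(seq)
    kept = sorted(used)
    old_to_new = {old: new for new, old in enumerate(kept)}
    return kept, old_to_new


def prune_embedding(embedding, kept):
    """Keep only the embedding rows whose token ids are in `kept`."""
    return [embedding[old_id] for old_id in kept]


def remap(seq, old_to_new):
    """Rewrite a token id sequence so it indexes the pruned matrix."""
    return [old_to_new[token_id] for token_id in seq]


# Toy example: a vocabulary of 10 tokens with 2-dimensional embeddings.
full_embedding = [[float(i), float(i) + 0.5] for i in range(10)]
train_data = [[3, 7, 3], [7, 9]]

kept, old_to_new = build_reduced_vocab(train_data, special_ids=(0,))
reduced_embedding = prune_embedding(full_embedding, kept)
remapped_data = [remap(seq, old_to_new) for seq in train_data]

# Fraction of embedding rows that no longer need to be held in memory.
rows_saved = 1 - len(reduced_embedding) / len(full_embedding)
```

In this toy setup only 4 of the 10 rows are kept. Any real saving depends on the model's vocabulary size and on how much of that vocabulary the fine-tuning data actually covers.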

Code Implementations · 2 repos