CL | AI | LG | Nov 26, 2024

Not All Adapters Matter: Selective Adapter Freezing for Memory-Efficient Fine-Tuning of Language Models

arXiv:2412.03587v2 | 12 citations | h-index: 2 | NAACL
AI Analysis

This work addresses the memory and computation inefficiencies of fine-tuning large language models, a practical concern for NLP practitioners, though it is incremental because it builds on existing adapter-tuning methods.

The paper tackles the problem of high resource usage in adapter-tuning for fine-tuning language models by proposing Selective Adapter FrEezing (SAFE), which freezes less important adapters early, reducing memory usage by 42.85%, computation by 34.59%, and training time by 11.82% while maintaining or improving task performance.

Transformer-based large-scale pre-trained models achieve great success. Fine-tuning is the standard practice for leveraging these models in downstream tasks. Among the fine-tuning methods, adapter-tuning provides parameter-efficient fine-tuning by introducing lightweight trainable modules while keeping most pre-trained parameters frozen. However, existing adapter-tuning methods still impose substantial resource usage. Through our investigation, we show that adapters contribute unequally to both task performance and resource usage. Motivated by this insight, we propose Selective Adapter FrEezing (SAFE), which gradually freezes less important adapters early to reduce unnecessary resource usage while maintaining performance. In our experiments, SAFE reduces memory usage, computation amount, and training time by 42.85%, 34.59%, and 11.82%, respectively, while achieving comparable or better task performance compared to the baseline. We also demonstrate that SAFE induces a regularization effect, thereby smoothing the loss landscape, which enables the model to generalize better by avoiding sharp minima.
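The idea of gradually freezing less important adapters can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how importance is scored or how the freezing schedule is shaped, so the `Adapter` class, the fixed `importance` values, the linear threshold ramp in `threshold_at`, and all numeric settings below are hypothetical. In a real training loop the scores would presumably be recomputed as training progresses, and freezing would mean disabling gradient updates for that adapter's parameters.

```python
from dataclasses import dataclass


@dataclass
class Adapter:
    """Stand-in for one adapter module; `trainable` mimics a requires-grad flag."""
    name: str
    importance: float  # hypothetical score; higher means more important
    trainable: bool = True


def threshold_at(epoch, warmup_epochs, final_epoch, final_threshold):
    """Ramp the freezing threshold linearly from 0 up to final_threshold.

    Nothing is frozen during warm-up. The linear shape is an assumption.
    """
    if epoch < warmup_epochs:
        return 0.0
    if epoch >= final_epoch:
        return final_threshold
    progress = (epoch - warmup_epochs) / (final_epoch - warmup_epochs)
    return final_threshold * progress


def freeze_step(adapters, threshold):
    """Freeze each still-trainable adapter whose score is below the threshold."""
    newly_frozen = []
    for adapter in adapters:
        if adapter.trainable and adapter.importance < threshold:
            adapter.trainable = False  # a frozen adapter is never unfrozen
            newly_frozen.append(adapter.name)
    return newly_frozen


# Illustrative scores only; they are not taken from the paper.
adapters = [
    Adapter("layer0", 0.10),
    Adapter("layer1", 0.35),
    Adapter("layer2", 0.80),
    Adapter("layer3", 0.55),
]

history = []
for epoch in range(6):
    t = threshold_at(epoch, warmup_epochs=1, final_epoch=5, final_threshold=0.6)
    history.append(freeze_step(adapters, t))

still_trainable = [a.name for a in adapters if a.trainable]
```

With these made-up scores, the lowest-scoring adapter is frozen first and the highest-scoring one stays trainable throughout. Each frozen adapter no longer needs gradients or optimizer state, which is the kind of saving the abstract attributes to early freezing.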

