CL · Mar 28, 2023

Scaling Down to Scale Up: A Guide to Parameter-Efficient Fine-Tuning

Vladislav Lialin, Vijeta Deshpande, Xiaowei Yao, Anna Rumshisky

arXiv:2303.15647v223.7276 citations · h-index: 35 · Has Code

Originality: Synthesis-oriented

AI Analysis

The paper addresses a practical problem, how to fine-tune large models efficiently, but its contribution is incremental because it synthesizes existing work rather than proposing a new method.

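This paper systematically reviews parameter-efficient fine-tuning (PEFT) methods for large language models, surveying over 50 papers and comparing 15 methods on models up to 11B parameters. It finds that methods previously reported to surpass a strong LoRA baseline struggle in resource-constrained settings, where hyperparameter optimization is limited and training runs for only a few epochs.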

This paper presents a systematic overview of parameter-efficient fine-tuning methods, covering over 50 papers published between early 2019 and mid-2024. These methods aim to address the challenges of fine-tuning large language models by training only a small subset of parameters. We provide a taxonomy that covers a broad range of methods and present a detailed method comparison with a specific focus on real-life efficiency in fine-tuning multibillion-scale language models. We also conduct an extensive head-to-head experimental comparison of 15 diverse PEFT methods, evaluating their performance and efficiency on models up to 11B parameters. Our findings reveal that methods previously shown to surpass a strong LoRA baseline face difficulties in resource-constrained settings, where hyperparameter optimization is limited and the network is fine-tuned only for a few epochs. Finally, we provide a set of practical recommendations for using PEFT methods and outline potential future research directions.
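To make "training only a small subset of parameters" concrete, here is a minimal sketch that counts trainable parameters for a single linear layer under a LoRA-style low-rank update, compared with updating the full weight matrix. The layer dimensions and rank are hypothetical values chosen for illustration and are not taken from the paper; the function names are likewise made up for this example.

```python
def full_finetune_params(d_in: int, d_out: int) -> int:
    """Trainable parameters when the whole d_out x d_in weight matrix is updated."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for a LoRA-style update W + B @ A.

    W (d_out x d_in) stays frozen; only B (d_out x rank) and
    A (rank x d_in) are trained.
    """
    return rank * (d_in + d_out)


def trainable_fraction(d_in: int, d_out: int, rank: int) -> float:
    """Ratio of low-rank trainable parameters to full fine-tuning parameters."""
    return lora_params(d_in, d_out, rank) / full_finetune_params(d_in, d_out)


# Hypothetical layer size and rank, chosen only for illustration.
D_IN, D_OUT, RANK = 4096, 4096, 8

full = full_finetune_params(D_IN, D_OUT)
low_rank = lora_params(D_IN, D_OUT, RANK)
print(f"full: {full:,}  low-rank: {low_rank:,}  "
      f"fraction: {trainable_fraction(D_IN, D_OUT, RANK):.4%}")
```

Under these assumed sizes the low-rank update trains 65,536 parameters instead of 16,777,216, which is 1/256 of the layer. Real savings depend on which layers are adapted and on the rank chosen.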
