Progtuning: Progressive Fine-tuning Framework for Transformer-based Language Models
This work addresses the high computational cost that researchers and practitioners face when fine-tuning large language models. The contribution is incremental, since it builds on existing parameter-efficient methods.
The paper tackles the inefficiency of fine-tuning large Transformer-based language models by proposing Progtuning, a framework that progressively reduces the number of updated Transformer blocks based on their contribution, cutting the number of updated parameters by about 25% while maintaining competitive performance.
Fine-tuning is a promising technique for leveraging Transformer-based language models in downstream tasks. As model sizes continue to grow, updating all model parameters becomes increasingly costly. Parameter-efficient fine-tuning methods address this issue by selectively updating a small subset of parameters. However, fine-tuning and most existing parameter-efficient fine-tuning methods keep the number of updated parameters fixed at its initial value throughout training, ignoring the unequal contribution of different Transformer blocks and leading to inefficient allocation of computing resources. In this paper, we propose Progtuning, a novel fine-tuning framework that combines fine-tuning with progressive learning for Transformer-based language models. Specifically, Progtuning progressively reduces the number of updated Transformer blocks based on their contribution. Progtuning optimizes resource allocation and reduces the number of updated parameters by approximately 25%, while still maintaining competitive performance. It also exhibits high adaptability with parameter-efficient fine-tuning methods, demonstrating excellent performance across various adaptation scenarios.
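The idea of progressively shrinking the set of updated blocks can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: the function names `trainable_blocks` and `average_updated_fraction` are hypothetical, training is assumed to be split into equal stages, and the lowest-indexed blocks are assumed to be the low-contribution ones that stop being updated first. The abstract states only that the reduction is "based on the contribution", so the actual criterion and schedule may differ.

```python
from typing import List


def trainable_blocks(num_blocks: int, num_stages: int, stage: int) -> List[int]:
    """Return indices of the blocks updated during `stage` (0-based).

    Assumption (not from the paper): blocks are frozen from the bottom up,
    and the number of frozen blocks grows linearly with the stage index.
    """
    if num_blocks <= 0 or num_stages <= 0:
        raise ValueError("num_blocks and num_stages must be positive")
    if not 0 <= stage < num_stages:
        raise ValueError("stage must satisfy 0 <= stage < num_stages")
    # Index of the first block that is still updated in this stage.
    first_trainable = (stage * num_blocks) // num_stages
    return list(range(first_trainable, num_blocks))


def average_updated_fraction(num_blocks: int, num_stages: int) -> float:
    """Average fraction of blocks updated per stage under the assumed schedule."""
    updated = sum(
        len(trainable_blocks(num_blocks, num_stages, s)) for s in range(num_stages)
    )
    return updated / (num_blocks * num_stages)


if __name__ == "__main__":
    # Hypothetical setting: a 12-block model trained in 4 equal stages.
    for s in range(4):
        print(s, trainable_blocks(12, 4, s))
    print(average_updated_fraction(12, 4))
```

With 12 blocks and 4 stages, this assumed schedule updates 12, 9, 6, and 3 blocks in successive stages, an average of 62.5% of blocks per stage. That figure is a property of the illustrative linear schedule only and should not be read as the roughly 25% reduction reported above. In a real training loop, the returned indices would decide which blocks have gradients enabled in each stage.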