Targeted Efficient Fine-tuning: Optimizing Parameter Updates with Data-Driven Sample Selection
This work addresses the computational cost of fine-tuning, a practical concern for NLP practitioners, but its contribution is incremental because it builds directly on the existing FISH Mask method.
The paper tackles inefficient parameter selection when fine-tuning large language models. It proposes the Iterative Range Decreasing (IRD) algorithm, which uses Fisher information to optimize the selection of sample-parameter pairs, and reports improved performance on the GLUE benchmark over baseline methods.
Fine-tuning all parameters of Large Language Models (LLMs) is computationally expensive. Parameter-Efficient Fine-Tuning (PEFT) methods address this by fine-tuning only a selected subset of parameters. Most PEFT methods center on selecting or introducing a set of parameters to be fine-tuned, but few consider the impact of data samples on parameter selection. A representative data-driven method is FISH Mask, which randomly selects a portion of the data samples as the basis for selecting parameters. However, this random sample selection cannot select optimal parameters when the data distribution is unstable. In this work, we introduce a data-centric approach and propose the Iterative Range Decreasing (IRD) algorithm to optimize the sample-parameter pair selection in FISH Mask. IRD iteratively refines the selection by identifying subsets of samples and parameters that exhibit higher Fisher information. We demonstrate the effectiveness and rationality of the proposed strategy through experiments on the GLUE benchmark. Experimental results show that our strategy improves parameter selection and achieves better performance than some typical baseline methods.
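The selection idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no algorithmic details, so the toy logistic-regression model, the use of observed labels to approximate the Fisher information by squared per-sample gradients, and every name and parameter below (`per_sample_sq_grads`, `fish_mask`, `iterative_range_decrease`, `shrink`, `min_samples`) are assumptions made for illustration. The first selection mimics a FISH-Mask-style choice from a random sample subset; the second shows one plausible way to shrink the sample and parameter ranges iteratively toward higher Fisher information.

```python
import numpy as np


def per_sample_sq_grads(X, y, w):
    """Squared per-sample gradients of the logistic loss w.r.t. w.

    Returns an array of shape (n_samples, n_params). Averaging rows gives a
    diagonal empirical Fisher estimate (assumption: observed labels are used).
    """
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    grads = (p - y)[:, None] * X  # per-sample gradient of the log-loss
    return grads ** 2


def fish_mask(sq_grads, sample_idx, k):
    """Indices of the k parameters with the highest Fisher estimate
    computed over the given samples (FISH-Mask-style top-k selection)."""
    fisher = sq_grads[sample_idx].mean(axis=0)
    return np.sort(np.argsort(fisher)[-k:])


def iterative_range_decrease(sq_grads, k, shrink=0.5, min_samples=8):
    """Hypothetical refinement loop (NOT the published IRD procedure):
    alternately keep the parameters and the samples with the highest
    Fisher scores until k parameters and min_samples samples remain."""
    n, d = sq_grads.shape
    samples, params = np.arange(n), np.arange(d)
    while True:
        # Narrow the parameter range, never below k.
        n_params = max(k, int(len(params) * shrink))
        param_scores = sq_grads[np.ix_(samples, params)].mean(axis=0)
        params = params[np.argsort(param_scores)[-n_params:]]
        # Narrow the sample range, never below min_samples.
        n_samples = max(min_samples, int(len(samples) * shrink))
        sample_scores = sq_grads[np.ix_(samples, params)].sum(axis=1)
        samples = samples[np.argsort(sample_scores)[-n_samples:]]
        if len(params) <= k and len(samples) <= min_samples:
            return np.sort(samples), np.sort(params)


# Toy data standing in for a real model and dataset.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 20))
y = (rng.random(64) < 0.5).astype(float)
w = np.zeros(20)

sq = per_sample_sq_grads(X, y, w)
random_subset = rng.choice(64, size=16, replace=False)
random_mask = fish_mask(sq, random_subset, k=5)          # random-sample baseline
kept_samples, refined_mask = iterative_range_decrease(sq, k=5)
```

In this sketch, `random_mask` depends on which samples happen to be drawn, whereas `refined_mask` is determined by the samples that carry the most Fisher information for the surviving parameters. Whether this matches the behavior of the actual IRD algorithm cannot be confirmed from the abstract alone.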