CopRA: A Progressive LoRA Training Strategy
This work offers an incremental improvement to parameter-efficient fine-tuning of foundation models, with benefits for downstream tasks such as model merging and pruning.
The paper addresses the tendency of LoRA fine-tuning to converge to suboptimal local optima. It introduces CopRA, a progressive training strategy that combines random layer dropping with Shapley value optimization, which enables efficient model merging and improves pruning performance.
Low-Rank Adaptation (LoRA) is a parameter-efficient technique for rapidly fine-tuning foundation models. In standard LoRA training dynamics, models tend to quickly converge to a local optimum near the initialization. However, this local optimum may not be ideal for out-of-distribution data or tasks such as merging and pruning. In this work, we propose a novel progressive training strategy for LoRA with random layer dropping. This strategy also optimizes the Shapley value of LoRA parameters in each layer, treating each layer as a player in a cooperative game. We refer to this method as Cooperative LoRA (CopRA). Our experimental results demonstrate that parameters trained with CopRA exhibit linear mode connectivity, which enables efficient model merging. This also paves the way for federated learning and multi-task learning via LoRA merging. Additionally, by optimizing the Shapley value, CopRA shows superior performance in pruning tasks.
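To make the two ingredients named above more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a linear ramp for the probability of keeping each layer's LoRA update active (the abstract says only that the strategy is progressive), and it computes exact Shapley values for a toy game in which each layer is a player. The names `keep_probability`, `sample_active_layers`, `shapley_values`, `ramp_fraction`, and the `gains` list are all hypothetical and introduced purely for illustration.

```python
import random
from itertools import combinations
from math import factorial


def keep_probability(step, total_steps, ramp_fraction=0.75):
    """Assumed schedule: chance that a layer's LoRA update is active at `step`.

    Ramps linearly from 0 to 1 over the first `ramp_fraction` of training,
    then stays at 1 so that all layers train jointly at the end.
    """
    ramp_steps = max(1, int(total_steps * ramp_fraction))
    return min(1.0, step / ramp_steps)


def sample_active_layers(num_layers, p_keep, rng):
    """Draw an independent keep/drop decision for each layer's LoRA update."""
    return [rng.random() < p_keep for _ in range(num_layers)]


def shapley_values(num_players, value_fn):
    """Exact Shapley values; value_fn maps a frozenset of players to a float.

    Enumerates every coalition, so it is only practical for a handful of players.
    """
    n = num_players
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = frozenset(subset)
                marginal = value_fn(coalition | {i}) - value_fn(coalition)
                phi[i] += weight * marginal
    return phi


# Toy usage: sample a layer mask partway through training.
rng = random.Random(0)
mask = sample_active_layers(num_layers=6, p_keep=keep_probability(30, 100), rng=rng)

# Made-up per-layer contributions for an additive toy game (not real results).
gains = [0.5, 0.2, 0.3]


def toy_value(active_layers):
    """Toy stand-in for model quality when only `active_layers` use LoRA."""
    return sum(gains[j] for j in active_layers)


phi = shapley_values(len(gains), toy_value)
```

In this additive toy game each layer's Shapley value equals its own contribution. In a real model the value function would be a task metric evaluated with only the chosen layers' LoRA updates enabled, and exact enumeration would have to give way to sampling, since the number of coalitions grows exponentially with the number of layers.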