CV · Oct 25, 2024

Model merging with SVD to tie the Knots

George Stoica, Pratik Ramesh, Boglarka Ecsedi, Leshem Choshen, Judy Hoffman

CMU

arXiv:2410.19735v132.093 citations · h-index: 12 · Has Code · ICLR

Originality: Incremental advance

AI Analysis

This paper addresses the challenge of efficiently combining specialized models into a single multi-task system. The advance is incremental, since it builds on existing merging methods.

The paper tackles the poor performance of merged LoRA fine-tuned models, whose weights are less well aligned than those of fully fine-tuned models. It proposes KnOTS, a method that uses the SVD to align the weights before merging, improving merging performance by up to 4.3% on vision and language benchmarks.

Recent model merging methods demonstrate that the parameters of fully-finetuned models specializing in distinct tasks can be combined into one model capable of solving all tasks without retraining. Yet, this success does not transfer well when merging LoRA finetuned models. We study this phenomenon and observe that the weights of LoRA finetuned models showcase a lower degree of alignment compared to their fully-finetuned counterparts. We hypothesize that improving this alignment is key to obtaining better LoRA model merges, and propose KnOTS to address this problem. KnOTS uses the SVD to jointly transform the weights of different LoRA models into an aligned space, where existing merging methods can be applied. In addition, we introduce a new benchmark that explicitly evaluates whether merged models are general models. Notably, KnOTS consistently improves LoRA merging by up to 4.3% across several vision and language benchmarks, including our new setting. We release our code at: https://github.com/gstoica27/KnOTS.
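To make the "jointly transform into an aligned space" step concrete, here is a minimal NumPy sketch. The abstract does not spell out the transformation, so the details are assumptions: the per-task weight updates are concatenated column-wise, a single SVD gives a shared basis `U` and singular values `S`, and only the per-task blocks of `Vt` are merged. The function name `knots_style_merge`, the `merge_fn` hook, and the default plain average (standing in for any existing merging method) are hypothetical and are not taken from the released code.

```python
import numpy as np

def knots_style_merge(deltas, merge_fn=None):
    """Merge per-task weight updates in a shared SVD basis (illustrative sketch).

    deltas:   list of arrays of shape (d_out, d_in), e.g. LoRA updates B @ A.
    merge_fn: maps an array of shape (n_tasks, k, d_in) to (k, d_in).
              Assumption: defaults to a plain average as a stand-in for
              an existing merging method.
    """
    if merge_fn is None:
        merge_fn = lambda blocks: blocks.mean(axis=0)

    n_tasks = len(deltas)
    d_in = deltas[0].shape[1]

    # Assumed joint transform: stack all updates side by side and factor once,
    # so every task is expressed in the same basis U and scale S.
    stacked = np.concatenate(deltas, axis=1)            # (d_out, n_tasks * d_in)
    U, S, Vt = np.linalg.svd(stacked, full_matrices=False)

    # Drop directions with numerically zero singular values.
    keep = S > 1e-10 * max(S.max(), 1.0)
    U, S, Vt = U[:, keep], S[keep], Vt[keep]

    # Split Vt back into one block per task; U @ diag(S) @ block_i equals deltas[i].
    blocks = np.stack([Vt[:, i * d_in:(i + 1) * d_in] for i in range(n_tasks)])

    # Merge only the task-specific blocks, then map back to weight space.
    merged_vt = merge_fn(blocks)                        # (k, d_in)
    return (U * S) @ merged_vt                          # (d_out, d_in)
```

With the default average, the result coincides with averaging the updates directly, because the map back to weight space is linear. The shared basis only changes the outcome when `merge_fn` is nonlinear, for example a method that prunes or sign-matches entries before combining them.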
