CL · Feb 24, 2025

Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment

Chenghao Fan, Zhenyi Lu, Sichen Liu, Chengfeng Gu, Xiaoye Qu, Wei Wei, Yu Cheng

arXiv:2502.16894v321.826 citations · h-index: 12 · Has Code · ICML

Originality: Incremental advance

AI Analysis

This work addresses parameter-efficient fine-tuning for LLMs, offering a method that brings LoRA's performance close to that of full fine-tuning. The advance is incremental, as it builds on existing LoRA and MoE approaches.

The paper tackles the performance gap between Low-Rank Adaptation (LoRA) and Full Fine-Tuning (Full FT) for Large Language Models by proposing GOAT, a framework that adaptively integrates singular value decomposition (SVD) priors through a Mixture-of-Experts (MoE) architecture and aligns optimization with that of a fully fine-tuned MoE using a theoretically derived scaling factor. It reports state-of-the-art results across 25 datasets, closing the gap with Full FT.

While Low-Rank Adaptation (LoRA) enables parameter-efficient fine-tuning for Large Language Models (LLMs), its performance often falls short of Full Fine-Tuning (Full FT). Current methods optimize LoRA by initializing with static singular value decomposition (SVD) subsets, leading to suboptimal leveraging of pre-trained knowledge. Another path for improving LoRA is incorporating a Mixture-of-Experts (MoE) architecture. However, weight misalignment and complex gradient dynamics make it challenging to adopt SVD prior to the LoRA MoE architecture. To mitigate these issues, we propose Great LoRA Mixture-of-Expert (GOAT), a framework that (1) adaptively integrates relevant priors using an SVD-structured MoE, and (2) aligns optimization with full fine-tuned MoE by deriving a theoretical scaling factor. We demonstrate that proper scaling, without modifying the architecture or training algorithms, boosts LoRA MoE's efficiency and performance. Experiments across 25 datasets, including natural language understanding, commonsense reasoning, image classification, and natural language generation, demonstrate GOAT's state-of-the-art performance, closing the gap with Full FT.
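To make the idea of an "SVD-structured MoE" more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes each LoRA expert is initialized from a different contiguous segment of the pre-trained weight's singular triplets, and that a simple top-k softmax router mixes the experts. The function names (`svd_segment_experts`, `lora_moe_forward`), the contiguous segment layout, the router, and the `scale` argument are all hypothetical. The abstract does not specify how GOAT selects segments, how it handles the frozen weight, or what its derived scaling factor is, so `scale` here is only a placeholder.

```python
import numpy as np

def svd_segment_experts(W, num_experts, rank):
    """Build `num_experts` LoRA pairs (B, A) from segments of the SVD of W.

    Expert i receives singular triplets [i*rank, (i+1)*rank). This contiguous
    layout is an assumption made for illustration only.
    """
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    experts = []
    for i in range(num_experts):
        seg = slice(i * rank, (i + 1) * rank)
        root = np.sqrt(S[seg])           # split each singular value across B and A
        B = U[:, seg] * root             # shape (out_dim, rank)
        A = root[:, None] * Vt[seg, :]   # shape (rank, in_dim)
        experts.append((B, A))
    return experts

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def lora_moe_forward(x, W0, experts, W_router, top_k=2, scale=1.0):
    """Compute x W0^T plus a gated sum of the top-k low-rank expert updates.

    `scale` is a placeholder multiplier, not the factor derived in the paper.
    """
    logits = x @ W_router.T                               # (batch, num_experts)
    top = np.argsort(-logits, axis=-1)[:, :top_k]         # indices of top-k experts
    gates = softmax(np.take_along_axis(logits, top, axis=-1))
    y = x @ W0.T                                          # frozen base path
    for b in range(x.shape[0]):
        for j in range(top_k):
            B, A = experts[top[b, j]]
            y[b] += scale * gates[b, j] * (B @ (A @ x[b]))
    return y
```

In this sketch, when `num_experts * rank` equals the number of singular values, the expert products `B @ A` sum back to `W`, which shows that the segments together partition the pre-trained weight. A static SVD initialization, as described in the abstract, would instead commit to a single fixed segment.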

