CL, LG, MM | Apr 21, 2024

Mixture of LoRA Experts

arXiv:2404.13628v1 | 190 citations | h-index: 41 | ICLR
Originality: Incremental advance
AI Analysis

This work addresses a bottleneck in multi-task fine-tuning for researchers and practitioners who use LoRA, though it appears incremental because it builds on existing LoRA fusion methods.

The paper tackles the problem of effectively combining multiple LoRA modules for fine-tuning large pre-trained models across diverse tasks, introducing the Mixture of LoRA Experts (MoLE) approach that achieves superior fusion performance compared to direct arithmetic merging while retaining flexibility.

LoRA has gained widespread acceptance in the fine-tuning of large pre-trained models to cater to a diverse array of downstream tasks, showcasing notable effectiveness and efficiency, thereby solidifying its position as one of the most prevalent fine-tuning techniques. Due to the modular nature of LoRA's plug-and-play plugins, researchers have delved into the amalgamation of multiple LoRAs to empower models to excel across various downstream tasks. Nonetheless, extant approaches for LoRA fusion grapple with inherent challenges. Direct arithmetic merging may result in the loss of the original pre-trained model's generative capabilities or the distinct identity of LoRAs, thereby yielding suboptimal outcomes. On the other hand, reference tuning-based fusion exhibits limitations concerning the requisite flexibility for the effective combination of multiple LoRAs. In response to these challenges, this paper introduces the Mixture of LoRA Experts (MoLE) approach, which harnesses hierarchical control and unfettered branch selection. The MoLE approach not only achieves superior LoRA fusion performance in comparison to direct arithmetic merging but also retains the crucial flexibility for combining LoRAs effectively. Extensive experimental evaluations conducted in both the Natural Language Processing (NLP) and Vision & Language (V&L) domains substantiate the efficacy of MoLE.
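
To make the contrast in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It compares direct arithmetic merging, where each LoRA gets a fixed coefficient, with a gated mixture, where a learnable gate assigns per-input coefficients to the LoRA outputs of a single layer. The function names, the gate shape (one score vector per expert), and the toy matrices are all assumptions made for illustration; the paper's actual gating function may differ.

```python
import math

def matvec(M, x):
    """Multiply matrix M (a list of rows) by vector x."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def lora_delta(A, B, x):
    """Low-rank update B(Ax) for one LoRA; A is r x d_in, B is d_out x r."""
    return matvec(B, matvec(A, x))

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def arithmetic_merge(W, loras, weights, x):
    """Direct arithmetic merging: fixed, input-independent coefficients."""
    out = matvec(W, x)
    for (A, B), w in zip(loras, weights):
        delta = lora_delta(A, B, x)
        out = [o + w * d for o, d in zip(out, delta)]
    return out

def gated_mixture(W, loras, gate, x):
    """Gated mixture (assumed form): each expert's score is the dot product
    of a hypothetical learnable gate vector with that expert's output, and
    a softmax turns the scores into per-input mixing coefficients."""
    deltas = [lora_delta(A, B, x) for A, B in loras]
    scores = [sum(g * d for g, d in zip(gate[i], deltas[i]))
              for i in range(len(loras))]
    probs = softmax(scores)
    out = matvec(W, x)
    for p, delta in zip(probs, deltas):
        out = [o + p * d for o, d in zip(out, delta)]
    return out, probs

# Toy setup: identity base weight and two rank-1 LoRAs (illustrative values).
W = [[1.0, 0.0], [0.0, 1.0]]
loras = [
    ([[1.0, 0.0]], [[1.0], [0.0]]),  # expert 0 acts on the first coordinate
    ([[0.0, 1.0]], [[0.0], [1.0]]),  # expert 1 acts on the second coordinate
]
x = [2.0, 3.0]

merged = arithmetic_merge(W, loras, [0.5, 0.5], x)
mixed, probs = gated_mixture(W, loras, [[1.0, 0.0], [0.0, 0.0]], x)
```

With an all-zero gate the softmax is uniform, so the gated mixture reduces to arithmetic merging with equal weights. A trained gate can instead favour one LoRA for a given input, and because each layer can hold its own gate, the coefficients can differ from layer to layer.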

Code Implementations: 1 repo
