CoMoL: Efficient Mixture of LoRA Experts via Dynamic Core Space Merging
This work addresses parameter efficiency and adaptation granularity in fine-tuning large language models for diverse downstream tasks, representing an incremental improvement over existing MoE-LoRA methods.
This paper tackles the limited parameter efficiency and coarse-grained adaptation of MoE-LoRA architectures for fine-tuning large language models. It proposes CoMoL, which uses core space experts and core space routing to achieve fine-grained adaptation with parameter efficiency comparable to standard LoRA, and it reports outperforming existing methods across multiple tasks.
Large language models (LLMs) achieve remarkable performance on diverse downstream and domain-specific tasks via parameter-efficient fine-tuning (PEFT). However, existing PEFT methods, particularly MoE-LoRA architectures, suffer from limited parameter efficiency and coarse-grained adaptation due to the proliferation of LoRA experts and instance-level routing. To address these issues, we propose Core Space Mixture of LoRA (\textbf{CoMoL}), a novel MoE-LoRA framework that combines expert diversity, parameter efficiency, and fine-grained adaptation. Specifically, CoMoL introduces two key components: core space experts and core space routing. Core space experts store each expert in a compact core matrix, preserving diversity while controlling parameter growth. Core space routing dynamically selects and activates the appropriate core experts for each token, enabling fine-grained, input-adaptive routing. Activated core experts are then merged via a soft-merging strategy into a single core expert, which is combined with a shared LoRA to form a specialized LoRA module. In addition, the routing network is projected into the same low-rank space as the LoRA matrices, further reducing parameter overhead without compromising expressiveness. Extensive experiments demonstrate that CoMoL retains the adaptability of MoE-LoRA architectures while achieving parameter efficiency comparable to standard LoRA, consistently outperforming existing methods across multiple tasks.
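To make the data flow described above concrete, the following is a minimal NumPy sketch of one possible reading of the abstract, not the authors' implementation. It assumes a shared LoRA pair `A` (down-projection) and `B` (up-projection), one r x r core matrix per expert, a router that acts on the rank-r activation `A x`, and a dense softmax over all experts as the soft-merging rule. The function name `comol_delta`, the argument names, the shapes, and the dense softmax gate are all illustrative assumptions; the paper may select a subset of experts or parameterize the router differently.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def comol_delta(x, A, B, cores, W_router):
    """Hypothetical per-token low-rank update with soft-merged core experts.

    x:        (tokens, d_in)  input activations
    A:        (r, d_in)       shared LoRA down-projection (assumed)
    B:        (d_out, r)      shared LoRA up-projection (assumed)
    cores:    (E, r, r)       one compact core matrix per expert (assumed)
    W_router: (E, r)          router acting in the rank-r space (assumed)
    """
    h = x @ A.T                                       # (tokens, r)
    gates = softmax(h @ W_router.T)                   # (tokens, E), per-token weights
    merged = np.einsum("te,eij->tij", gates, cores)   # (tokens, r, r), one core per token
    z = np.einsum("tij,tj->ti", merged, h)            # apply the merged core
    return z @ B.T, gates                             # (tokens, d_out), (tokens, E)


# Small usage example with arbitrary sizes.
rng = np.random.default_rng(0)
tokens, d_in, d_out, r, E = 5, 16, 16, 4, 3
x = rng.normal(size=(tokens, d_in))
A = rng.normal(size=(r, d_in))
B = rng.normal(size=(d_out, r))
cores = rng.normal(size=(E, r, r))
W_router = rng.normal(size=(E, r))

delta, gates = comol_delta(x, A, B, cores, W_router)

# Reference: gate-weighted sum of each expert's own output B M_e A x.
reference = np.zeros((tokens, d_out))
for e in range(E):
    reference += gates[:, [e]] * (x @ A.T @ cores[e].T @ B.T)
```

Under these assumptions, merging the cores before applying them gives the same result as the gate-weighted sum of per-expert outputs (`reference` above), while the large `A` and `B` projections are applied only once per token. Each additional expert adds an r x r core and a length-r router row rather than a full pair of LoRA matrices.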