CL | AI | May 19, 2024

MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models

arXiv:2405.13053v38.215 citations | h-index: 4 | Has Code | ICLR

Originality: Incremental advance

AI Analysis

This work addresses a bottleneck in parameter-efficient fine-tuning for LLM deployment, offering an incremental improvement for multi-task inference scenarios.

The paper tackles autonomous task sensing and switching in large language models that carry multiple LoRA adapters. It introduces MeteoRA, a framework that reuses existing adapters through a Mixture-of-Experts architecture. The reported results show performance equivalent to traditional PEFT and stronger handling of composite tasks, such as solving ten sequential problems in one inference pass.

The pretrain+fine-tune paradigm is foundational for deploying large language models (LLMs) across various downstream applications. Within this framework, Low-Rank Adaptation (LoRA) stands out as a parameter-efficient fine-tuning (PEFT) method that has produced numerous reusable task-specific LoRA adapters. However, this approach requires the intended task to be selected explicitly, which makes autonomous task sensing and switching during inference difficult when multiple existing LoRA adapters are embedded in a single LLM. In this work, we introduce MeteoRA (Multiple-tasks Embedded LoRA), a scalable and efficient framework that integrates multiple task-specific LoRA adapters into the base LLM via a full-mode Mixture-of-Experts (MoE) architecture. The framework also includes novel MoE forward acceleration strategies to address the efficiency challenges of traditional MoE implementations. Our evaluation, using the LlaMA2-13B and LlaMA3-8B base models equipped with 28 existing LoRA adapters through MeteoRA, demonstrates performance equivalent to the traditional PEFT method. Moreover, the LLM equipped with MeteoRA achieves superior performance on composite tasks, effectively solving ten sequential problems in a single inference pass, which demonstrates the framework's enhanced capability for timely adapter switching.
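
To make the idea of gated adapter reuse more concrete, the following is a minimal sketch of a single linear layer that keeps a frozen base weight and lets a small gate pick among several LoRA adapters for each token. It is not the authors' implementation: the class name `GatedLoRALinear`, the top-k routing with softmax renormalization, the random initialization, and the tiny dimensions are all assumptions chosen for illustration. Only the count of 28 adapters comes from the abstract. Pure-Python lists stand in for tensors so the example runs without dependencies.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Small random matrix; stands in for trained weights (assumption)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class GatedLoRALinear:
    """Hypothetical layer: frozen base weight plus gated LoRA adapters."""

    def __init__(self, d_in, d_out, rank, n_adapters, top_k=2, seed=0):
        rng = random.Random(seed)
        self.top_k = top_k  # assumed routing width, not taken from the abstract
        self.base = rand_matrix(d_out, d_in, rng)       # frozen base weight W
        self.gate = rand_matrix(n_adapters, d_in, rng)  # one gate row per adapter
        # Each adapter is a low-rank pair (A: rank x d_in, B: d_out x rank).
        self.adapters = [
            (rand_matrix(rank, d_in, rng), rand_matrix(d_out, rank, rng))
            for _ in range(n_adapters)
        ]

    def route(self, x):
        """Return (adapter_index, weight) pairs for the top-k gate logits."""
        logits = matvec(self.gate, x)
        ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
        chosen = ranked[: self.top_k]
        weights = softmax([logits[i] for i in chosen])
        return list(zip(chosen, weights))

    def forward(self, x):
        """Compute W x plus the gate-weighted sum of B_i (A_i x)."""
        out = matvec(self.base, x)
        for index, weight in self.route(x):
            a, b = self.adapters[index]
            delta = matvec(b, matvec(a, x))
            out = [o + weight * d for o, d in zip(out, delta)]
        return out


# Usage: one token vector routed across 28 adapters.
layer = GatedLoRALinear(d_in=8, d_out=4, rank=2, n_adapters=28, top_k=2)
token = [0.5] * 8
routing = layer.route(token)
output = layer.forward(token)
```

Because the gate is evaluated per token, consecutive tokens can be routed to different adapters without any external task label, which is the behaviour the abstract calls autonomous task sensing and switching. A practical implementation would batch this computation on a tensor library rather than loop over adapters.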
