LoRA on the Go: Instance-level Dynamic LoRA Selection and Merging
This work addresses the problem of adapting large language models to diverse and unpredictable real-world inputs, offering a practical solution that requires no additional training.
Conventional LoRA adapters are trained for single tasks. To overcome this limitation, the paper introduces LoRA on the Go (LoGo), a training-free framework that dynamically selects and merges adapters at the instance level. LoGo outperforms training-based baselines by up to 3.6% on some tasks across multiple benchmarks while maintaining inference throughput.
Low-Rank Adaptation (LoRA) has emerged as a parameter-efficient approach for fine-tuning large language models. However, conventional LoRA adapters are typically trained for a single task, limiting their applicability in real-world settings where inputs may span diverse and unpredictable domains. Existing approaches combine multiple LoRAs at inference time to improve performance on diverse tasks, but they usually require labeled data or additional task-specific training, which is expensive at scale. In this work, we introduce LoRA on the Go (LoGo), a training-free framework that dynamically selects and merges adapters at the instance level without any additional requirements. LoGo leverages signals extracted from a single forward pass through the LoRA adapters to identify the most relevant adapters and determine their contributions on the fly. Across 5 NLP benchmarks, 27 datasets, and 3 model families, LoGo outperforms training-based baselines on some tasks by a margin of up to 3.6% while remaining competitive on other tasks and maintaining inference throughput, highlighting its effectiveness and practicality.
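
To make the idea of instance-level selection and merging concrete, the following is a minimal sketch and not the paper's implementation. The abstract does not specify which signal LoGo extracts, so this sketch assumes the L2 norm of each adapter's low-rank update for the given input as the relevance score, keeps the top-k adapters, and weights them in proportion to their scores. The function names `lora_delta` and `select_and_merge` and the parameter `k` are hypothetical.

```python
import math

def lora_delta(x, A, B):
    """Low-rank update B @ (A @ x) for one adapter, using plain lists.

    x: input vector of length d_in
    A: r x d_in down-projection, B: d_out x r up-projection
    """
    h = [sum(a * xi for a, xi in zip(row, x)) for row in A]     # length r
    return [sum(b * hi for b, hi in zip(row, h)) for row in B]  # length d_out

def select_and_merge(x, adapters, k=2):
    """Score every adapter on one input, keep the top-k, merge their updates.

    ASSUMPTION: the relevance signal is the L2 norm of each adapter's update;
    the actual signal used by LoGo is not stated in the abstract.
    """
    # One pass through every adapter for this single instance.
    deltas = [lora_delta(x, A, B) for A, B in adapters]
    scores = [math.sqrt(sum(v * v for v in d)) for d in deltas]

    # Indices of the k highest-scoring adapters, best first.
    top = sorted(range(len(adapters)), key=lambda i: scores[i], reverse=True)[:k]

    # Normalize the selected scores into merge weights (uniform if all zero).
    total = sum(scores[i] for i in top)
    weights = [scores[i] / total if total > 0 else 1.0 / len(top) for i in top]

    # Weighted sum of the selected adapters' updates.
    d_out = len(deltas[0])
    merged = [sum(w * deltas[i][j] for w, i in zip(weights, top))
              for j in range(d_out)]
    return top, weights, merged
```

Because the scores are computed per input, two different inputs can select different adapters and different weights without any training step. In a real model the merged update would be added to the frozen layer's output at each adapted layer.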