LoRA-Augmented Generation (LAG) for Knowledge-Intensive Language Tasks
LAG addresses the need for efficient expert combination in language models, though the contribution is incremental because it builds on existing adapter and retrieval methods.
The paper tackles the problem of efficiently selecting and combining task-specific LoRA adapters for knowledge-intensive language tasks, achieving superior performance over existing data-free methods.
The proliferation of fine-tuned language model experts for specific tasks and domains signals the need for efficient selection and combination methods. We propose LoRA-Augmented Generation (LAG) for leveraging large libraries of knowledge- and task-specific LoRA adapters. LAG requires no additional training or access to data, and efficiently filters, retrieves, and applies experts on a per-token and per-layer basis. We evaluate LAG on various knowledge-intensive tasks, achieving superior performance over existing data-free methods. We also explore scenarios where additional data is available, demonstrating LAG's compatibility with alternative solutions such as retrieval-augmented generation (RAG).
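To make the filter, retrieve, and apply stages concrete, the following is a minimal sketch of selecting one LoRA adapter per token within a single layer. The abstract does not specify how adapters are scored, so both scoring rules here are assumptions chosen for illustration: the filter stage ranks adapters by the magnitude of the first row of `A` dotted with the token, and the retrieve stage picks the candidate with the largest `||A x||`. The names `LoRAAdapter`, `filter_adapters`, `retrieve_adapter`, and `apply_layer` are hypothetical, not taken from the paper. Only the final update `y = W x + B (A x)` is the standard LoRA formulation.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Tuple

Vector = List[float]
Matrix = List[Vector]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply matrix m (rows x cols) by vector v (length cols)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def norm(v: Vector) -> float:
    """Euclidean norm of v."""
    return sqrt(sum(a * a for a in v))


@dataclass
class LoRAAdapter:
    A: Matrix  # down-projection, shape (r, d)
    B: Matrix  # up-projection, shape (d, r)


def filter_adapters(library: List[LoRAAdapter], x: Vector, k: int) -> List[int]:
    # Stage 1 (ASSUMED rule): cheap score |A[0] . x|, keep the k best indices.
    scores = [abs(sum(a * b for a, b in zip(ad.A[0], x))) for ad in library]
    ranked = sorted(range(len(library)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def retrieve_adapter(library: List[LoRAAdapter], x: Vector,
                     candidates: List[int]) -> int:
    # Stage 2 (ASSUMED rule): among the candidates, pick the largest ||A x||.
    return max(candidates, key=lambda i: norm(matvec(library[i].A, x)))


def apply_layer(W: Matrix, library: List[LoRAAdapter], x: Vector,
                k: int) -> Tuple[Vector, int]:
    # Stage 3: standard LoRA update with the chosen adapter, y = W x + B (A x).
    idx = retrieve_adapter(library, x, filter_adapters(library, x, k))
    chosen = library[idx]
    base = matvec(W, x)
    delta = matvec(chosen.B, matvec(chosen.A, x))
    return [b + d for b, d in zip(base, delta)], idx


# Toy layer: identity base weight and three rank-1 adapters (illustrative values).
W = [[1.0, 0.0], [0.0, 1.0]]
library = [
    LoRAAdapter(A=[[1.0, 0.0]], B=[[1.0], [0.0]]),
    LoRAAdapter(A=[[0.0, 1.0]], B=[[0.0], [1.0]]),
    LoRAAdapter(A=[[0.1, 0.1]], B=[[1.0], [1.0]]),
]

# Per-token selection: each token is routed independently.
tokens = [[2.0, 0.0], [0.0, 3.0]]
choices = [apply_layer(W, library, x, k=2)[1] for x in tokens]
print(choices)
```

In this toy setup the two tokens select different adapters, which is the per-token behavior the abstract describes. Extending the sketch to per-layer selection would mean giving each layer its own `W` and adapter library and calling `apply_layer` once per layer, so that the choice can differ across layers as well as across tokens.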