LG · AI · Apr 16, 2025

Activated LoRA: Fine-tuned LLMs for Intrinsics

Kristjan Greenewald, Luis Lastras, Thomas Parnell, Vraj Shah, Lucian Popa, Giulio Zizzo, Chulaka Gunasekara, Ambrish Rawat, David Cox

arXiv:2504.12397v5 · 8 citations · h-index: 25 · Has Code

Originality: Incremental advance

AI Analysis

This addresses a bottleneck for users of large language models in dynamic, multiturn applications by enabling faster switching between specialized adapters.

The paper tackles the inefficiency of switching between LoRA adapters in multiturn settings by proposing Activated LoRA (aLoRA), which adapts weights only for tokens after invocation, allowing reuse of the base model's KV cache and improving inference efficiency while maintaining competitive accuracy.

Low-Rank Adaptation (LoRA) has emerged as a highly efficient framework for finetuning the weights of large foundation models, and has become the go-to method for data-driven customization of LLMs. Despite the promise of highly customized behaviors and capabilities, switching between relevant LoRAs in a multiturn setting is inefficient, as the key-value (KV) cache of the entire turn history must be recomputed with the LoRA weights before generation can begin. To address this problem, we propose Activated LoRA (aLoRA), an adapter architecture that modifies the LoRA framework to adapt weights only for the tokens in the sequence after the aLoRA is invoked. This change crucially allows aLoRA to accept the base model's KV cache of the input string, meaning that aLoRA can be instantly activated whenever needed in a chain without recomputing the prior keys and values. This enables building what we call intrinsics, i.e., specialized models invoked to perform well-defined operations on portions of an input chain or conversation that otherwise uses the base model by default. We train a set of aLoRA-based intrinsics models, demonstrating competitive accuracy with standard LoRA while significantly improving inference efficiency. We contributed our Activated LoRA implementation to the Hugging Face PEFT library: https://github.com/huggingface/peft
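The cache-reuse argument in the abstract can be illustrated with a toy, single-layer sketch. This is not the authors' implementation or the PEFT API. It is a minimal stdlib-only model of one key projection, and every name in it (`project`, `invoke_at`, `W_k`, `A`, `B`, the dimensions) is a hypothetical chosen for illustration. It assumes the usual LoRA form, where the adapted projection is `W x + B (A x)`, and models aLoRA as applying that low-rank term only to tokens at or after the invocation position.

```python
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def make_matrix(rows, cols, rng, scale=1.0):
    """Random Gaussian matrix, standing in for trained weights."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def project(tokens, W, A=None, B=None, start=0):
    """Project each token with W. If A and B are given, add the low-rank
    update B(Ax), but only for tokens at position >= start."""
    out = []
    for pos, x in enumerate(tokens):
        y = matvec(W, x)
        if A is not None and pos >= start:
            delta = matvec(B, matvec(A, x))
            y = [yi + di for yi, di in zip(y, delta)]
        out.append(y)
    return out


rng = random.Random(0)
d_model, rank, seq_len = 8, 2, 6
invoke_at = 4  # hypothetical position where the adapter is invoked

W_k = make_matrix(d_model, d_model, rng)     # base key projection
A = make_matrix(rank, d_model, rng, 0.1)     # low-rank down-projection
B = make_matrix(d_model, rank, rng, 0.1)     # low-rank up-projection
tokens = [[rng.gauss(0.0, 1.0) for _ in range(d_model)] for _ in range(seq_len)]

base_keys = project(tokens, W_k)                           # base model's key cache
lora_keys = project(tokens, W_k, A, B, start=0)            # standard LoRA: every token adapted
alora_keys = project(tokens, W_k, A, B, start=invoke_at)   # aLoRA-style: only later tokens adapted

# With the aLoRA-style rule the prefix keys match the base cache exactly,
# so they could be reused; with standard LoRA they would need recomputing.
prefix_reusable_alora = alora_keys[:invoke_at] == base_keys[:invoke_at]
prefix_reusable_lora = lora_keys[:invoke_at] == base_keys[:invoke_at]
print(prefix_reusable_alora, prefix_reusable_lora)
```

In this single-projection toy, the keys after `invoke_at` coincide with those of standard LoRA. In a full multi-layer transformer that would generally not hold, since the later tokens attend to base-model keys and values for the prefix rather than adapted ones.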
