Towards Higher Effective Rank in Parameter-efficient Fine-tuning using Khatri-Rao Product
This work addresses the problem of efficient fine-tuning for large models. It offers a practical alternative to LoRA with improved performance in high-rank scenarios, although the contribution is incremental because it builds on existing PEFT methods.
The paper tackles a limitation of low-rank adaptation (LoRA) in parameter-efficient fine-tuning: LoRA struggles to approximate matrices with a high effective rank. It introduces KRAdapter, a novel method that uses the Khatri-Rao product, and reports performance gains on vision-language models up to 1B parameters and on large language models up to 8B parameters, particularly on unseen common-sense reasoning tasks.
Parameter-efficient fine-tuning (PEFT) has become a standard approach for adapting large pre-trained models. Amongst PEFT methods, low-rank adaptation (LoRA) has achieved notable success. However, recent studies have highlighted its limitations compared with full-rank alternatives, particularly when applied to multimodal and large language models. In this work, we present a quantitative comparison of full-rank and low-rank PEFT methods using a synthetic matrix approximation benchmark with controlled spectral properties. Our results confirm that LoRA struggles to approximate matrices with relatively flat spectra or high-frequency components, both signs of a high effective rank. To this end, we introduce KRAdapter, a novel PEFT algorithm that leverages the Khatri-Rao product to produce weight updates which, by construction, tend to have a high effective rank. We demonstrate performance gains with KRAdapter on vision-language models up to 1B parameters and on large language models up to 8B parameters, particularly on unseen common-sense reasoning tasks. In addition, KRAdapter maintains the memory and compute efficiency of LoRA, making it a practical and robust alternative for fine-tuning billion-scale parameter models.
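The contrast between a low-rank update and a Khatri-Rao update can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. Several choices here are assumptions made for illustration: the factor shapes (`U` of size a x d_in and `V` of size b x d_in with a * b = d_out), the Gaussian initialisation, the toy dimensions, and the definition of effective rank as the exponential of the Shannon entropy of the normalised singular values. The helper names `khatri_rao` and `effective_rank` are hypothetical.

```python
import numpy as np


def khatri_rao(U, V):
    """Column-wise Kronecker product: (a, k) and (b, k) -> (a * b, k)."""
    a, k = U.shape
    b, k_v = V.shape
    if k != k_v:
        raise ValueError("U and V must have the same number of columns")
    # Row i * b + l of column j holds U[i, j] * V[l, j].
    return (U[:, None, :] * V[None, :, :]).reshape(a * b, k)


def effective_rank(M, eps=1e-12):
    """Assumed definition: exp of the entropy of the normalised singular values."""
    s = np.linalg.svd(M, compute_uv=False)
    p = s / (s.sum() + eps)
    p = p[p > eps]  # drop numerically zero singular values
    return float(np.exp(-(p * np.log(p)).sum()))


rng = np.random.default_rng(0)
d_out, d_in = 64, 64  # toy layer size (assumption)

# Khatri-Rao style update: two small factors, (a + b) * d_in parameters.
a, b = 8, 8  # a * b must equal d_out
U = rng.standard_normal((a, d_in))
V = rng.standard_normal((b, d_in))
delta_kr = khatri_rao(U, V)  # shape (d_out, d_in)

# LoRA style update with the same parameter budget: r * (d_out + d_in).
r = 8
B = rng.standard_normal((d_out, r))
A = rng.standard_normal((r, d_in))
delta_lora = B @ A  # rank at most r

print("parameters  KR:", U.size + V.size, " LoRA:", B.size + A.size)
print("effective rank  KR:", effective_rank(delta_kr))
print("effective rank  LoRA:", effective_rank(delta_lora))
```

With the same parameter budget, the LoRA product is bounded by rank `r`, whereas the Khatri-Rao product of generic factors is not subject to that bound, which is the property the abstract refers to as holding "by construction".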