LG · Feb 17, 2025

Minimal Ranks, Maximum Confidence: Parameter-efficient Uncertainty Quantification for LoRA

Patryk Marszałek, Klaudia Bałazy, Jacek Tabor, Tomasz Kuśmierczyk

arXiv:2502.12122v214.44 citations · h-index: 4 · Has Code · EMNLP

Originality: Incremental advance

AI Analysis

This paper addresses uncertainty quantification for fine-tuned large language models. It offers a more parameter-efficient solution than existing Bayesian LoRA variants, though the advance over prior LoRA methods is incremental.

The paper tackles the overconfidence and poor calibration of models fine-tuned with Low-Rank Adaptation (LoRA) by proposing a parameter-efficient Bayesian LoRA based on subspace inference, which improves calibration and generalization while maintaining computational efficiency.

Low-Rank Adaptation (LoRA) enables parameter-efficient fine-tuning of large language models by decomposing weight updates into low-rank matrices, significantly reducing storage and computational overhead. While effective, standard LoRA lacks mechanisms for uncertainty quantification, leading to overconfident and poorly calibrated models. Bayesian variants of LoRA address this limitation, but at the cost of a significantly increased number of trainable parameters, partially offsetting the original efficiency gains. Additionally, these models are harder to train and may suffer from unstable convergence. In this work, we propose a novel parameter-efficient Bayesian LoRA via subspace inference, demonstrating that effective uncertainty quantification can be achieved in very low-dimensional parameter spaces. The proposed method achieves strong performance with improved calibration and generalization while maintaining computational efficiency. Our empirical findings show that, with the appropriate projection of the weight space: (1) uncertainty can be effectively modeled in a low-dimensional space, and (2) weight covariances exhibit low ranks.
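The two ideas in the abstract, a low-rank weight update and a posterior confined to a low-dimensional subspace, can be illustrated with a small NumPy sketch. This is a generic subspace-inference toy under stated assumptions, not the authors' parameterization: the random projection `P`, the diagonal Gaussian over `z`, the choice to make only the `A` factor stochastic, and all sizes (`d_out`, `d_in`, `r`, `k`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def lora_delta(B, A, scale=1.0):
    """Low-rank weight update delta_W = scale * B @ A (rank at most r)."""
    return scale * (B @ A)

def sample_in_subspace(theta_mean, P, z_mean, z_std, rng, n_samples):
    """Sample theta = theta_mean + P @ z with z ~ N(z_mean, diag(z_std**2)).

    P is a fixed (D, k) projection (an assumption of this sketch), so only
    the 2 * k numbers in z_mean and z_std describe the uncertainty.
    """
    k = z_mean.shape[0]
    z = z_mean + z_std * rng.standard_normal((n_samples, k))
    return theta_mean + z @ P.T

rng = np.random.default_rng(0)
d_out, d_in, r, k = 16, 32, 4, 3  # hypothetical sizes

W0 = rng.standard_normal((d_out, d_in))         # frozen pretrained weight
B = 0.1 * rng.standard_normal((d_out, r))       # deterministic LoRA factor
A_mean = 0.1 * rng.standard_normal((r, d_in))   # mean of the stochastic factor
P = rng.standard_normal((r * d_in, k))          # hypothetical random projection
z_mean, z_std = np.zeros(k), np.full(k, 0.05)

# Samples of the flattened A factor, each of dimension r * d_in.
samples = sample_in_subspace(A_mean.ravel(), P, z_mean, z_std, rng, n_samples=500)

# Covariance implied in the full parameter space: P diag(z_std^2) P^T.
implied_cov = P @ np.diag(z_std**2) @ P.T
cov_rank = np.linalg.matrix_rank(implied_cov)

# Monte Carlo predictive mean and spread for a single input vector.
x = rng.standard_normal(d_in)
outputs = np.stack([(W0 + lora_delta(B, a.reshape(r, d_in))) @ x for a in samples])
pred_mean, pred_std = outputs.mean(axis=0), outputs.std(axis=0)

print("uncertainty parameters:", 2 * k, "vs. full diagonal Gaussian:", 2 * r * d_in)
print("rank of implied covariance:", cov_rank)
```

In this toy setting the covariance over the `r * d_in` adapter parameters has rank at most `k` by construction, which mirrors, in a simplified form, the abstract's observation that weight covariances can be low-rank under a suitable projection.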
