Uncertainty quantification in fine-tuned LLMs using LoRA ensembles
This work addresses uncertainty quantification for fine-tuned LLMs, an incremental improvement relevant to researchers and practitioners who need reliable model predictions.
The paper addresses the problem of understanding what fine-tuned large language models learn and forget, and how far their predictions can be trusted. It develops a method for uncertainty quantification based on low-rank adaptation ensembles, and reports unexpected retention of knowledge during fine-tuning in the overfitting regime.
Fine-tuning large language models can improve task-specific performance, although a general understanding of what the fine-tuned model has learned and forgotten, and of how to trust its predictions, is still missing. We derive principled uncertainty quantification for fine-tuned LLMs with posterior approximations using computationally efficient low-rank adaptation ensembles. We analyze three common multiple-choice datasets using low-rank adaptation ensembles based on Mistral-7b, and draw quantitative and qualitative conclusions on their perceived complexity and on the balance between retained prior knowledge and domain-specific adaptation during and after fine-tuning. We identify unexpected retention of acquired knowledge during fine-tuning in the overfitting regime.
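To make the idea of ensemble-based uncertainty concrete, the sketch below shows one common way to turn the answer probabilities of several ensemble members into uncertainty scores for a multiple-choice question. It is an illustration under stated assumptions, not the paper's own procedure: the abstract does not specify which measures are used, so the entropy-based split into total, aleatoric, and epistemic (mutual information) parts is assumed here as a standard choice for ensembles. The function names and the example probabilities are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete probability vector."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def ensemble_uncertainty(member_probs):
    """Entropy-based uncertainty scores for a single question.

    member_probs: list of M probability vectors, one per ensemble member,
    each over the same K answer choices. (Assumed input format.)
    """
    m = len(member_probs)
    k = len(member_probs[0])
    # Ensemble prediction: average the members' distributions.
    mean = [sum(p[j] for p in member_probs) / m for j in range(k)]
    total = entropy(mean)                                  # predictive entropy
    aleatoric = sum(entropy(p) for p in member_probs) / m  # mean member entropy
    epistemic = total - aleatoric                          # mutual information
    return {"mean": mean, "total": total,
            "aleatoric": aleatoric, "epistemic": epistemic}


# Hypothetical outputs of three ensemble members on a 4-choice question.
agree = [[0.7, 0.1, 0.1, 0.1]] * 3          # members give the same answer
disagree = [[0.97, 0.01, 0.01, 0.01],       # members are confident
            [0.01, 0.97, 0.01, 0.01],       # but pick different answers
            [0.01, 0.01, 0.97, 0.01]]

u_agree = ensemble_uncertainty(agree)
u_disagree = ensemble_uncertainty(disagree)
```

In this toy setting, `u_agree["epistemic"]` is essentially zero because all members return the same distribution, while `u_disagree["epistemic"]` is large because the members are individually confident yet contradict each other. Disagreement of this kind is the signal an ensemble adds over a single fine-tuned model.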