LG | AI | ML | Aug 19, 2025

Amortized Bayesian Meta-Learning for Low-Rank Adaptation of Large Language Models

arXiv:2508.14285v1 | 1 citation | h-index: 3 | Proceedings of the 2nd Workshop on Uncertainty-Aware NLP (UncertaiNLP 2025)
Originality: Incremental advance
AI Analysis

The paper addresses computational efficiency and generalization for researchers and practitioners who fine-tune LLMs. It appears incremental, since it adapts amortized Bayesian meta-learning from smaller models to LLMs.

The paper targets the high memory and computation costs of existing methods for improving how well LoRA-fine-tuned large language models generalize to unseen datasets. It proposes ABMLL, which the authors report outperforms existing methods on the Unified-QA and CrossFit benchmarks in both accuracy and expected calibration error.
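Expected calibration error (ECE) measures how far a model's stated confidence is from its actual accuracy. The sketch below shows the common equal-width binned estimator. It is an illustration only: the function name `expected_calibration_error`, the choice of 10 bins, and the toy inputs are assumptions, and the paper's exact evaluation protocol is not given here.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted mean of |accuracy - confidence| over bins.

    confidences: predicted probability of the chosen answer, in [0, 1].
    correct: 1 if the chosen answer was right, else 0.
    n_bins: number of equal-width bins (assumed value, not from the paper).
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Assign each prediction to a bin (edges[i], edges[i+1]].
    bin_ids = np.digitize(confidences, edges[1:-1], right=True)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if not mask.any():
            continue  # empty bins contribute nothing
        gap = abs(correct[mask].mean() - confidences[mask].mean())
        ece += mask.mean() * gap  # weight by the fraction of samples in the bin
    return float(ece)

# Toy example: two confident correct answers, two less confident mixed answers.
toy_ece = expected_calibration_error([0.95, 0.95, 0.65, 0.65], [1, 1, 1, 0])
```

A lower value means confidence tracks accuracy more closely; a model that is always right with confidence 1.0 scores 0.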

Fine-tuning large language models (LLMs) with low-rank adaptation (LoRA) is a cost-effective way to incorporate information from a specific dataset. However, it is often unclear how well the fine-tuned LLM will generalize, i.e., how well it will perform on unseen datasets. Methods have been proposed to improve generalization by optimizing with in-context prompts, or by using meta-learning to fine-tune LLMs. However, these methods are expensive in memory and computation, requiring either long-context prompts or saving copies of parameters and using second-order gradient updates. To address these challenges, we propose Amortized Bayesian Meta-Learning for LoRA (ABMLL). This method builds on amortized Bayesian meta-learning for smaller models, adapting this approach to LLMs while maintaining its computational efficiency. We reframe task-specific and global parameters in the context of LoRA and use a set of new hyperparameters to balance reconstruction accuracy and the fidelity of task-specific parameters to the global ones. ABMLL provides effective generalization and scales to large models such as Llama3-8B. Furthermore, as a result of using a Bayesian framework, ABMLL provides improved uncertainty quantification. We test ABMLL on Unified-QA and CrossFit datasets and find that it outperforms existing methods on these benchmarks in terms of both accuracy and expected calibration error.
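For readers unfamiliar with LoRA, the sketch below shows the standard low-rank parameterization that the abstract builds on: a frozen weight matrix plus a trainable rank-r update. It is a minimal illustration of plain LoRA, not of ABMLL itself. The function name `lora_forward`, the scaling value `alpha`, and all dimensions are assumptions chosen for the example.

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16.0):
    """Linear layer with a LoRA update: y = x W^T + (alpha / r) * x A^T B^T.

    W: frozen pretrained weight, shape (d_out, d_in).
    A: trainable down-projection, shape (r, d_in).
    B: trainable up-projection, shape (d_out, r).
    alpha: scaling hyperparameter (assumed value for illustration).
    """
    r = A.shape[0]
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T

rng = np.random.default_rng(0)
d_in, d_out, r = 64, 32, 4                 # hypothetical sizes
W = rng.normal(size=(d_out, d_in))         # frozen during fine-tuning
A = rng.normal(size=(r, d_in)) * 0.01      # small random initialization
B = np.zeros((d_out, r))                   # zero init: the update starts at zero
x = rng.normal(size=(8, d_in))             # batch of 8 inputs

y = lora_forward(x, W, A, B)
full_params = W.size                       # parameters in the frozen matrix
lora_params = A.size + B.size              # parameters actually trained
```

With these sizes only 384 parameters are trained against 2048 in the full matrix, which is the source of LoRA's memory savings. In the abstract's framing, adapter parameters of this kind are what get split into task-specific and global sets.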

