Bayesian Low-Rank Factorization for Robust Model Adaptation
This work addresses robust model adaptation for speech processing in code-switching scenarios. It is an incremental improvement over existing methods such as LoRA.
The paper tackles the problem of adapting large speech foundation models to specific domains, such as multilingual code-switching, without overfitting or catastrophic forgetting. It proposes Bayesian factorized adapters that, compared to LoRA, achieve a 54% backward gain with only a 4% drop in new-domain performance.
Large speech foundation models achieve strong performance across many domains, but they often require adaptation to handle local needs such as code-switching, where speakers mix languages within the same utterance. Direct fine-tuning of these models risks overfitting to the target domain and overwriting the broad capabilities of the base model. To address this challenge, we explore Bayesian factorized adapters for speech foundation models, which place priors near zero to achieve sparser adaptation matrices and thereby retain general performance while adapting to specific domains. We apply our approach to the Whisper model and evaluate on different multilingual code-switching scenarios. Our results show only minimal adaptation loss while significantly reducing catastrophic forgetting of the base model. Compared to LoRA, our method achieves a backward gain of 54% with only a 4% drop on the new domain. These findings highlight the effectiveness of Bayesian adaptation for fine-tuning speech foundation models without sacrificing generalization.
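The idea of placing a prior near zero on a factorized adapter can be illustrated with a minimal sketch. The abstract does not specify the prior, the posterior family, or the training objective, so everything below is an assumption made for illustration: a mean-field Gaussian posterior over each entry of the low-rank factors B and A, a zero-mean Gaussian prior with a hypothetical scale `prior_sigma`, and a hypothetical weight `beta` on the KL penalty. The names `kl_to_zero_mean_prior`, `sample_factor`, and `adapted_weight` are likewise hypothetical and are not taken from the paper or from any library.

```python
import math
import random


def kl_to_zero_mean_prior(mu, log_sigma, prior_sigma):
    """KL( N(mu, sigma^2) || N(0, prior_sigma^2) ) for a single scalar weight."""
    sigma = math.exp(log_sigma)
    return (math.log(prior_sigma / sigma)
            + (sigma ** 2 + mu ** 2) / (2.0 * prior_sigma ** 2)
            - 0.5)


def total_kl(mu, log_sigma, prior_sigma):
    """Sum the per-entry KL terms over one factor matrix (list of rows)."""
    return sum(
        kl_to_zero_mean_prior(m, s, prior_sigma)
        for mu_row, ls_row in zip(mu, log_sigma)
        for m, s in zip(mu_row, ls_row)
    )


def sample_factor(mu, log_sigma, rng):
    """Reparameterized sample: w = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [
        [m + math.exp(s) * rng.gauss(0.0, 1.0) for m, s in zip(mu_row, ls_row)]
        for mu_row, ls_row in zip(mu, log_sigma)
    ]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def adapted_weight(w0, b, a):
    """LoRA-style update: W = W0 + B @ A, with the base weight W0 left untouched."""
    delta = matmul(b, a)
    return [[w + d for w, d in zip(w_row, d_row)] for w_row, d_row in zip(w0, delta)]


# Toy usage with assumed sizes (not the paper's configuration).
rng = random.Random(0)
d_out, d_in, rank = 4, 3, 2
prior_sigma = 0.1   # assumed prior scale, concentrated near zero
beta = 1e-3         # assumed weight on the KL penalty

w0 = [[rng.gauss(0.0, 1.0) for _ in range(d_in)] for _ in range(d_out)]
mu_b = [[0.0] * rank for _ in range(d_out)]
mu_a = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
ls_b = [[math.log(prior_sigma)] * rank for _ in range(d_out)]
ls_a = [[math.log(prior_sigma)] * d_in for _ in range(rank)]

w = adapted_weight(w0, sample_factor(mu_b, ls_b, rng), sample_factor(mu_a, ls_a, rng))
kl_penalty = beta * (total_kl(mu_b, ls_b, prior_sigma) + total_kl(mu_a, ls_a, prior_sigma))
print("KL penalty added to the task loss:", kl_penalty)
```

In this sketch the KL term grows as a posterior mean moves away from zero, so minimizing task loss plus the penalty favors adapter entries that stay small unless the target domain needs them. This is one plausible reading of how a near-zero prior could keep the adapted weights close to the base model; the paper's actual formulation may differ.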