CL SD AS
Jun 19, 2025

Weight Factorization and Centralization for Continual Learning in Speech Recognition

Enes Yavuz Ugan, Ngoc-Quan Pham, Alexander Waibel

arXiv:2506.16574v1
3 citations
h-index: 13
INTERSPEECH

Originality: Incremental advance

AI Analysis

This work addresses how to maintain model quality in rehearsal-free, multilingual speech recognition systems used in downstream applications. The contribution appears incremental because it builds on existing continual learning methods.

The paper tackles catastrophic forgetting in continual learning for speech recognition by proposing a two-phase approach of factorization and centralization. In experiments on a sequence of varied code-switching datasets, the centralization stage is reported to prevent forgetting effectively.

Modern neural-network-based speech recognition models are required to continually absorb new data without retraining the whole system, especially in downstream applications that use foundation models and have no access to the original training data. Continually training the models in a rehearsal-free, multilingual, and language-agnostic condition is likely to lead to catastrophic forgetting, where a seemingly insignificant disruption to the weights can severely harm the quality of the models. Inspired by the ability of human brains to learn and consolidate knowledge through the waking-sleeping cycle, we propose a continual learning approach with two distinct phases, factorization and centralization, which learn and merge knowledge, respectively. Our experiments on a sequence of varied code-switching datasets showed that the centralization stage can effectively prevent catastrophic forgetting by accumulating the knowledge held in multiple scattered low-rank adapters.
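The two phases can be illustrated with a toy sketch. The abstract does not specify the update rules, so everything here is an assumption made for illustration: a single linear layer stands in for the speech model, the factorization phase trains a low-rank update `B @ A` on top of frozen weights `W`, and the centralization phase merges the mean of the collected low-rank updates into `W`. The function names `factorize_step` and `centralize` are hypothetical and do not come from the paper.

```python
import numpy as np


def factorize_step(W, X, Y, rank=2, lr=0.1, steps=200, seed=0):
    """Factorization phase (assumed form): learn a low-rank update B @ A
    for one dataset while the central weights W stay frozen."""
    rng = np.random.default_rng(seed)
    d_out, d_in = W.shape
    A = rng.normal(scale=0.01, size=(rank, d_in))  # small random init
    B = np.zeros((d_out, rank))                    # zero init, so B @ A starts at 0
    n = X.shape[0]
    for _ in range(steps):
        err = X @ (W + B @ A).T - Y                # residual, shape (n, d_out)
        grad_delta = err.T @ X / n                 # gradient w.r.t. the effective weight
        grad_B = grad_delta @ A.T
        grad_A = B.T @ grad_delta
        B -= lr * grad_B
        A -= lr * grad_A
    return B, A


def centralize(W, adapters):
    """Centralization phase (assumed form): merge the mean of the
    low-rank updates into the central weights and return the result."""
    delta = sum(B @ A for B, A in adapters) / len(adapters)
    return W + delta
```

In this sketch, one would call `factorize_step` once per incoming dataset, collect the returned `(B, A)` pairs, and then call `centralize(W, adapters)` to obtain the new central weights before the next round. Averaging is only one possible merging rule; the paper's actual centralization procedure may differ.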
