CL SD AS Jun 19, 2025

Weight Factorization and Centralization for Continual Learning in Speech Recognition

arXiv:2506.16574v1 3 citations h-index: 13 INTERSPEECH
Originality: Incremental advance
AI Analysis

This paper addresses the problem of maintaining model quality in rehearsal-free, multilingual speech recognition systems used in downstream applications. The contribution appears incremental, as it builds on existing continual learning methods.

The paper tackles catastrophic forgetting in continual learning for speech recognition with a two-phase approach, factorization followed by centralization. In experiments on a sequence of varied code-switching datasets, the centralization stage effectively prevents forgetting.

Modern neural-network-based speech recognition models are required to continually absorb new data without retraining the whole system, especially in downstream applications that use foundation models and have no access to the original training data. Continually training the models in a rehearsal-free, multilingual, and language-agnostic condition likely leads to catastrophic forgetting, in which a seemingly insignificant disruption to the weights can destructively harm the quality of the models. Inspired by the ability of human brains to learn and consolidate knowledge through the waking-sleeping cycle, we propose a continual learning approach with two distinct phases, factorization and centralization, which learn and merge knowledge respectively. Our experiments on a sequence of varied code-switching datasets showed that the centralization stage can effectively prevent catastrophic forgetting by accumulating the knowledge in multiple scattered low-rank adapters.
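The abstract does not spell out how the two phases are implemented, so the following is only a minimal sketch of one possible reading, not the paper's algorithm. It assumes each new dataset yields a low-rank update B @ A to a frozen weight matrix (the "factorization" phase) and that "centralization" folds the accumulated adapters back into the central weights by plain summation. The function names `factorize` and `centralize`, the truncated-SVD step, the random stand-in updates, and the summation rule are all hypothetical choices made for illustration.

```python
import numpy as np


def factorize(delta, rank):
    """Approximate a weight update `delta` as B @ A with the given rank.

    Assumption: truncated SVD stands in for whatever training procedure
    actually produces the low-rank adapter.
    """
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    b = u[:, :rank] * s[:rank]  # shape (out_dim, rank)
    a = vt[:rank, :]            # shape (rank, in_dim)
    return b, a


def centralize(base, adapters):
    """Fold all adapters into a copy of the base weights.

    Assumption: adapters are merged by simple summation; the paper may
    use a different merging rule.
    """
    merged = base.copy()
    for b, a in adapters:
        merged += b @ a
    return merged


rng = np.random.default_rng(0)
base = rng.normal(size=(8, 6))  # hypothetical frozen central weights

adapters = []
deltas = []
for _ in range(3):  # three hypothetical datasets seen in sequence
    # Random rank-2 matrix standing in for a task-specific weight update.
    delta = 0.01 * rng.normal(size=(8, 2)) @ rng.normal(size=(2, 6))
    deltas.append(delta)
    adapters.append(factorize(delta, rank=2))

central = centralize(base, adapters)
```

In this toy setup the base weights stay untouched while each adapter is built, and they change only at the merge step, which is the property the waking-sleeping analogy in the abstract suggests.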

