AS CL SD Jun 1, 2020

Learning to Recognize Code-switched Speech Without Forgetting Monolingual Speech Recognition

arXiv:2006.00782v1, 19 citations
Originality: Incremental advance
AI Analysis

This work addresses a practical issue for speech recognition systems in multilingual environments where code-switching occurs, offering an incremental improvement over existing methods.

The paper shows that fine-tuning automatic speech recognition models on code-switched speech harms their monolingual performance. To maintain accuracy on both types of speech, it proposes the Learning Without Forgetting framework for cases where only the monolingual model is available, and regularization strategies for cases where the monolingual training data is available as well. It reports improvements in Word Error Rate on monolingual and code-switched test sets compared to baselines.

Recently, there has been significant progress made in Automatic Speech Recognition (ASR) of code-switched speech, leading to gains in accuracy on code-switched datasets in many language pairs. Code-switched speech co-occurs with monolingual speech in one or both languages being mixed. In this work, we show that fine-tuning ASR models on code-switched speech harms performance on monolingual speech. We point out the need to optimize models for code-switching while also ensuring that monolingual performance is not sacrificed. Monolingual models may be trained on thousands of hours of speech which may not be available for re-training a new model. We propose using the Learning Without Forgetting (LWF) framework for code-switched ASR when we only have access to a monolingual model and do not have the data it was trained on. We show that it is possible to train models using this framework that perform well on both code-switched and monolingual test sets. In cases where we have access to monolingual training data as well, we propose regularization strategies for fine-tuning models for code-switching without sacrificing monolingual accuracy. We report improvements in Word Error Rate (WER) in monolingual and code-switched test sets compared to baselines that use pooled data and simple fine-tuning.
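The abstract does not spell out the loss it uses, so the following is only a minimal sketch of the general Learning Without Forgetting idea as it is commonly described: the model being fine-tuned is trained on the new (code-switched) labels while a distillation term keeps its outputs close to those of the frozen original (monolingual) model on the same inputs. The function names, the single shared output layer, the per-frame classification setup, and the `distill_weight` and `temperature` values are all illustrative assumptions, not details taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits, with temperature scaling."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, predicted_probs, eps=1e-12):
    """Cross-entropy between a soft target distribution and a predicted one."""
    return -sum(t * math.log(p + eps) for t, p in zip(target_probs, predicted_probs))


def lwf_loss(new_logits, old_logits, label, distill_weight=1.0, temperature=2.0):
    """Hypothetical LWF-style loss for one output frame.

    new_logits: outputs of the model being fine-tuned on code-switched data
    old_logits: outputs of the frozen monolingual model on the same input
    label:      index of the correct class for the code-switched example
    """
    # New-task term: ordinary cross-entropy against the ground-truth label.
    new_probs = softmax(new_logits)
    task_loss = -math.log(new_probs[label] + 1e-12)

    # Distillation term: match the softened outputs of the frozen model,
    # which discourages drift away from its monolingual behaviour.
    soft_targets = softmax(old_logits, temperature)
    soft_preds = softmax(new_logits, temperature)
    distill_loss = cross_entropy(soft_targets, soft_preds)

    return task_loss + distill_weight * distill_loss
```

With `distill_weight=0.0` this reduces to plain fine-tuning, the setting the abstract reports as harming monolingual performance. Larger values trade some fit to the code-switched data for staying closer to the original model.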

