CLJun 7, 2018

Domain Adversarial Training for Accented Speech Recognition

Sining Sun, Ching-Feng Yeh, Mei-Yuh Hwang, Mari Ostendorf, Lei Xie

arXiv:1806.02786v19.3146 citations

Originality Incremental advance

AI Analysis

This addresses the challenge of speech recognition for users with heavy accents, though it is incremental as it builds on existing domain adversarial methods.

The paper tackles the problem of accented speech recognition by using domain adversarial training to learn accent-invariant features, resulting in up to 7.45% relative character error rate reduction on Mandarin accents without transcriptions.

In this paper, we propose a domain adversarial training (DAT) algorithm to alleviate the accented speech recognition problem. In order to reduce the mismatch between labeled source domain data ("standard" accent) and unlabeled target domain data (with heavy accents), we augment the learning objective for a Kaldi TDNN network with a domain adversarial training (DAT) objective to encourage the model to learn accent-invariant features. In experiments with three Mandarin accents, we show that DAT yields up to 7.45% relative character error rate reduction when we do not have transcriptions of the accented speech, compared with the baseline trained on standard accent data only. We also find a benefit from DAT when used in combination with training from automatic transcriptions on the accented data. Furthermore, we find that DAT is superior to multi-task learning for accented speech recognition.

View on arXiv PDF

Similar