LG SD AS Dec 28, 2020

Lattice-Free MMI Adaptation Of Self-Supervised Pretrained Acoustic Models

Apoorv Vyas, Srikanth Madikeri, Hervé Bourlard

arXiv:2012.14252v24.214 citations Has Code

Originality: Incremental advance

AI Analysis

This work reports relative WER improvements of 4.3% to 35.3% over a supervised baseline by adapting self-supervised pretrained acoustic models with LFMMI, which is useful for researchers and practitioners working on ASR.

This paper proposes lattice-free MMI (LFMMI) for supervised adaptation of self-supervised pretrained acoustic models. The authors show that fine-tuning with LFMMI consistently yields relative WER improvements over a supervised baseline: 10% and 35.3% on the clean and other test sets of Librispeech (100h), 10.8% on Switchboard (300h), 4.3% on Swahili (38h), and 4.4% on Tagalog (84h).

In this work, we propose lattice-free MMI (LFMMI) for supervised adaptation of self-supervised pretrained acoustic models. We pretrain a Transformer model on a thousand hours of untranscribed Librispeech data, followed by supervised adaptation with LFMMI on three different datasets. Our results show that by fine-tuning with LFMMI, we consistently obtain relative WER improvements of 10% and 35.3% on the clean and other test sets of Librispeech (100h), 10.8% on Switchboard (300h), 4.3% on Swahili (38h), and 4.4% on Tagalog (84h) compared to the baseline trained only with supervised data.
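The figures above are relative, not absolute, WER improvements. As a rough illustration of how such a figure is usually computed, here is a minimal sketch; the function name and the example WER values are hypothetical and are not taken from the paper.

```python
def relative_wer_improvement(baseline_wer: float, adapted_wer: float) -> float:
    """Return the relative WER reduction (in percent) of an adapted model
    over a baseline: 100 * (baseline - adapted) / baseline."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - adapted_wer) / baseline_wer


# Hypothetical WERs for illustration only (not results from the paper):
# a baseline at 10.0% WER and an adapted model at 9.0% WER.
improvement = relative_wer_improvement(10.0, 9.0)
print(f"relative WER improvement: {improvement:.1f}%")
```

Under this definition, a drop of one absolute WER point counts as a larger relative improvement when the baseline WER is low than when it is high, so relative figures across test sets are not directly comparable in absolute terms.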
