LG, SD, AS | Dec 28, 2020

Lattice-Free MMI Adaptation Of Self-Supervised Pretrained Acoustic Models

arXiv:2012.14252v2 | 14 citations
AI Analysis

This work provides significant WER improvements for speech recognition systems by effectively adapting self-supervised pretrained acoustic models, which is beneficial for researchers and practitioners working on ASR.

This paper proposes lattice-free MMI (LFMMI) for supervised adaptation of self-supervised pretrained acoustic models. The authors show that fine-tuning with LFMMI consistently yields relative WER improvements of 10% and 35.3% on the clean and other test sets of Librispeech (100h), 10.8% on Switchboard (300h), 4.3% on Swahili (38h), and 4.4% on Tagalog (84h) compared to a baseline trained only with supervised data.

In this work, we propose lattice-free MMI (LFMMI) for supervised adaptation of self-supervised pretrained acoustic models. We pretrain a Transformer model on a thousand hours of untranscribed Librispeech data, followed by supervised adaptation with LFMMI on three different datasets. Our results show that by fine-tuning with LFMMI, we consistently obtain relative WER improvements of 10% and 35.3% on the clean and other test sets of Librispeech (100h), 10.8% on Switchboard (300h), 4.3% on Swahili (38h), and 4.4% on Tagalog (84h) compared to the baseline trained only with supervised data.
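The improvements quoted above are relative, not absolute, WER reductions. As a minimal sketch of that arithmetic, the snippet below computes a relative improvement from a baseline WER and an adapted-model WER. The function name and the example WER values are hypothetical and chosen only for illustration; they are not figures from the paper.

```python
def relative_wer_improvement(baseline_wer: float, adapted_wer: float) -> float:
    """Return the relative WER reduction, in percent, of the adapted model
    over the baseline: 100 * (baseline - adapted) / baseline."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - adapted_wer) / baseline_wer


# Hypothetical WERs (in percent), NOT taken from the paper.
baseline = 10.0  # supervised-only baseline
adapted = 9.0    # pretrained model fine-tuned with LFMMI
print(relative_wer_improvement(baseline, adapted))  # 10.0
```

Under these assumed numbers, a one-point absolute drop from 10.0% to 9.0% WER corresponds to a 10% relative improvement, which is the sense in which the percentages in the abstract should be read.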

Code Implementations (2 repos)
