CL · SD · AS · Mar 26, 2021

BART based semantic correction for Mandarin automatic speech recognition system

arXiv:2104.05507v1 · 36 citations
Originality: Incremental advance
AI Analysis

This work addresses error correction in Mandarin ASR systems, showing incremental improvements through a hybrid approach.

The paper tackles the problem of spoken language recognition errors in Mandarin automatic speech recognition systems by proposing a Transformer-based semantic correction method with pretrained BART initialization, resulting in a 21.7% relative reduction in character error rate on a 10,000-hour dataset.

Although automatic speech recognition (ASR) systems have improved significantly in recent years, they still make spoken language recognition errors that humans can easily spot. Various language modeling techniques have been developed for post-recognition tasks such as semantic correction. In this paper, we propose a Transformer-based semantic correction method with pretrained BART initialization. Experiments on a 10,000-hour Mandarin speech dataset show that the character error rate (CER) is reduced by 21.7% relative to our baseline ASR system. Expert evaluation demonstrates that the actual improvement of our model surpasses what CER indicates.
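
To make the "21.7% relative" figure concrete, here is a minimal sketch of how CER and a relative CER reduction are commonly computed. It assumes the standard definition of CER as character-level edit distance divided by reference length; the abstract does not say which scoring tool the authors used. The function names and the example sentences are hypothetical illustrations, not data from the paper.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Character-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance normalized by reference length."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


def relative_reduction(baseline_cer: float, corrected_cer: float) -> float:
    """Fraction of the baseline CER removed by the correction step."""
    if baseline_cer <= 0:
        raise ValueError("baseline CER must be positive")
    return (baseline_cer - corrected_cer) / baseline_cer


# Hypothetical toy sentences (not from the paper's dataset).
reference = "今天天气很好"
asr_output = "今天天汽很号"   # two wrong characters
corrected = "今天天气很号"    # correction model fixed one of them

baseline = cer(reference, asr_output)
after = cer(reference, corrected)
print(f"baseline CER:       {baseline:.3f}")
print(f"corrected CER:      {after:.3f}")
print(f"relative reduction: {relative_reduction(baseline, after):.1%}")
```

Under this definition, a 21.7% relative reduction means the corrected system's CER is 0.783 times the baseline CER; it does not mean the absolute CER dropped by 21.7 percentage points.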

