AS CL SD, Sep 14, 2023

Incorporating Class-based Language Model for Named Entity Recognition in Factorized Neural Transducer

Peng Wang, Yifan Yang, Zheng Liang, Tian Tan, Shiliang Zhang, Xie Chen

arXiv:2309.07648v21.21 citations, h-index: 16

Originality: Incremental advance

AI Analysis

This paper addresses accurate named entity recognition in speech recognition systems, which is critical for semantic understanding. The approach appears incremental, since it builds on existing factorized neural transducer methods.

The paper tackles the challenge of named entity recognition in speech recognition by proposing C-FNT, a model that incorporates class-based language models into factorized neural transducers, resulting in significantly reduced errors in named entities without degrading general word recognition performance.

Despite advancements of end-to-end (E2E) models in speech recognition, named entity recognition (NER) is still challenging but critical for semantic understanding. Previous studies mainly focus on various rule-based or attention-based contextual biasing algorithms. However, their performance might be sensitive to the biasing weight or degraded by excessive attention to the named entity list, along with a risk of false triggering. Inspired by the success of the class-based language model (LM) in NER in conventional hybrid systems and the effective decoupling of acoustic and linguistic information in the factorized neural Transducer (FNT), we propose C-FNT, a novel E2E model that incorporates class-based LMs into FNT. In C-FNT, the LM score of named entities can be associated with the name class instead of its surface form. The experimental results show that our proposed C-FNT significantly reduces error in named entities without hurting performance in general word recognition.
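
The idea of tying the LM score to the name class rather than the surface form can be illustrated with the classic class-based LM factorization, P(word | history) = P(class | history) × P(word | class). The sketch below is a toy illustration only and is not the C-FNT implementation: the `<NAME>` token, the probability tables, the entity list, and the `log_prob` helper are all hypothetical, and the uniform within-class distribution is an assumption made for simplicity.

```python
import math

# Hypothetical toy numbers, chosen only for illustration.
# Class-level LM: P(next token | history), where <NAME> stands for the whole name class.
class_lm = {
    ("call",): {"<NAME>": 0.6, "me": 0.3, "back": 0.1},
}

# Within-class distribution P(entity | <NAME>); assumed uniform over a small entity list.
name_list = ["alice", "bob", "zbigniew"]
in_class = {name: 1.0 / len(name_list) for name in name_list}


def log_prob(history, word):
    """Log-probability of `word` after `history` under the toy class-based LM."""
    dist = class_lm[tuple(history)]
    if word in in_class:
        # Named entity: score the class token, then the entity within the class.
        return math.log(dist["<NAME>"]) + math.log(in_class[word])
    # Ordinary word: scored directly by the class-level LM.
    return math.log(dist[word])


if __name__ == "__main__":
    for w in ["alice", "zbigniew", "me"]:
        print(w, round(log_prob(["call"], w), 4))
```

In this sketch a rare name such as "zbigniew" receives the same score as a common one such as "alice", because both inherit the probability of the `<NAME>` class in that context. Ordinary words are scored exactly as they would be without the class mechanism.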
