AS CL SD | Sep 14, 2023

Incorporating Class-based Language Model for Named Entity Recognition in Factorized Neural Transducer

arXiv:2309.07648v21 citations | h-index: 16
Originality: Incremental advance
AI Analysis

This paper addresses accurate named entity recognition in speech recognition systems, which is critical for semantic understanding. The approach appears incremental, as it builds on existing factorized neural transducer methods.

The paper proposes C-FNT, a model that incorporates class-based language models into the factorized neural transducer. The authors report significantly fewer errors on named entities with no degradation in general word recognition.

Despite advancements of end-to-end (E2E) models in speech recognition, named entity recognition (NER) is still challenging but critical for semantic understanding. Previous studies mainly focus on various rule-based or attention-based contextual biasing algorithms. However, their performance might be sensitive to the biasing weight or degraded by excessive attention to the named entity list, along with a risk of false triggering. Inspired by the success of the class-based language model (LM) in NER in conventional hybrid systems and the effective decoupling of acoustic and linguistic information in the factorized neural Transducer (FNT), we propose C-FNT, a novel E2E model that incorporates class-based LMs into FNT. In C-FNT, the LM score of named entities can be associated with the name class instead of its surface form. The experimental results show that our proposed C-FNT significantly reduces error in named entities without hurting performance in general word recognition.
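
To make the idea of scoring a named entity through its class rather than its surface form concrete, here is a minimal sketch of the generic class-based LM factorization, P(word | history) = P(class | history) x P(word | class). It is not the C-FNT implementation: the tables `WORD_LM` and `CLASS_LM`, the `<name>` tag, the function `class_based_log_prob`, and all probabilities are hypothetical values chosen for illustration only.

```python
import math

# Hypothetical toy tables (assumptions for illustration, not from the paper).
# Word-level LM: P(token | history); a token may be a class tag such as <name>.
WORD_LM = {
    ("call",): {"<name>": 0.6, "me": 0.3, "back": 0.1},
}

# Class-internal model: P(surface form | class).
CLASS_LM = {
    "<name>": {"alice": 0.5, "bob": 0.3, "zoltan": 0.2},
}


def class_based_log_prob(history, word):
    """Return log P(word | history) under the toy class-based LM.

    An ordinary word is scored directly by the word-level LM. A word that
    belongs to a class is scored as P(class | history) * P(word | class),
    so the history only has to predict the class tag, not the rare name.
    """
    dist = WORD_LM[tuple(history)]
    total = dist.get(word, 0.0)  # direct path for ordinary words
    for tag, members in CLASS_LM.items():
        if word in members:
            # Class path: probability of the tag times the in-class probability.
            total += dist.get(tag, 0.0) * members[word]
    return math.log(total) if total > 0 else float("-inf")
```

In this toy setup, a rare name such as "zoltan" still receives a usable score after "call", because the context-dependent part of the score belongs to the `<name>` class. Adding an entity then only requires a new entry in `CLASS_LM`.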

