A Discriminative Entity-Aware Language Model for Virtual Assistants
This work addresses a domain-specific problem for virtual assistants, improving ASR accuracy on named entities, but its contribution is incremental: it extends existing discriminative language modeling approaches.
The paper tackles poor ASR performance on named entities in virtual assistant requests by incorporating real-world knowledge from a Knowledge Graph into a discriminative language model. It achieves relative sentence error rate reductions of more than 25% on some synthesized test sets covering less popular entities, with minimal degradation on a uniformly sampled test set.
High-quality automatic speech recognition (ASR) is essential for virtual assistants (VAs) to work well. However, ASR often performs poorly on VA requests containing named entities. In this work, we start from the observation that many ASR errors on named entities are inconsistent with real-world knowledge. We extend previous discriminative n-gram language modeling approaches to incorporate real-world knowledge from a Knowledge Graph (KG), using features that capture entity type-entity and entity-entity relationships. We apply our model through an efficient lattice rescoring process, achieving relative sentence error rate reductions of more than 25% on some synthesized test sets covering less popular entities, with minimal degradation on a uniformly sampled VA test set.
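The rescoring idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several things in it are assumptions made for illustration: it rescores a short n-best list rather than a lattice, the toy knowledge graph (`ENTITY_TYPES`, `RELATED`), the feature names, and the hand-set `WEIGHTS` are all hypothetical, and in a real discriminative language model the weights would be learned from data rather than written by hand. The sketch shows how a linear model over n-gram features plus knowledge-graph features can prefer a hypothesis that is consistent with real-world knowledge over one that the first-pass recognizer scored slightly higher.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy knowledge graph: entity -> type, plus related entity pairs.
ENTITY_TYPES = {"adele": "artist", "hello": "song", "a dell": "company"}
RELATED = {("hello", "adele")}


def find_entities(words):
    """Return (entity, start_index) for every 1- or 2-word span found in the KG."""
    found = []
    for i in range(len(words)):
        for n in (2, 1):
            if i + n > len(words):
                continue  # span would run past the end of the hypothesis
            span = " ".join(words[i:i + n])
            if span in ENTITY_TYPES:
                found.append((span, i))
    return found


def extract_features(text):
    """Sparse feature counts: bigrams plus two kinds of KG features (assumed)."""
    words = text.split()
    feats = Counter()
    # Plain bigram features, as in a discriminative n-gram language model.
    for left, right in zip(["<s>"] + words, words + ["</s>"]):
        feats[f"bigram:{left}_{right}"] += 1
    entities = find_entities(words)
    # Context word paired with the entity type that follows it.
    for entity, start in entities:
        prev = words[start - 1] if start > 0 else "<s>"
        feats[f"kg:ctx_type:{prev}_{ENTITY_TYPES[entity]}"] += 1
    # Pairs of entities in the hypothesis that are linked in the KG.
    for (a, _), (b, _) in combinations(entities, 2):
        if (a, b) in RELATED or (b, a) in RELATED:
            feats["kg:related"] += 1
    return feats


def rescore(nbest, weights):
    """Add the linear model score to each first-pass score; best hypothesis first."""
    rescored = []
    for text, base_score in nbest:
        feats = extract_features(text)
        bonus = sum(weights.get(name, 0.0) * count for name, count in feats.items())
        rescored.append((text, base_score + bonus))
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Hand-set weights for illustration only; a real model would learn these.
WEIGHTS = {
    "kg:related": 2.0,
    "kg:ctx_type:by_artist": 1.0,
    "kg:ctx_type:by_company": -1.0,
}

# First-pass log-scores (made up): the recognizer slightly prefers the wrong entity.
NBEST = [("play hello by a dell", -4.0), ("play hello by adele", -4.5)]

best_text, best_score = rescore(NBEST, WEIGHTS)[0]
```

With these made-up numbers, the hypothesis containing the artist linked to the song in the toy graph overtakes the first-pass winner. A lattice rescoring pass, as described in the abstract, would apply the same kind of feature scoring over a compact graph of hypotheses instead of an explicit list.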