CLMay 22, 2017

Use of Knowledge Graph in Rescoring the N-Best List in Automatic Speech Recognition

arXiv:1705.08018v16 citations
Originality Synthesis-oriented
AI Analysis

This addresses accuracy issues in ASR for real-time applications, but it is incremental as it builds on existing rescoring methods.

The paper tackled the problem of low recognition accuracy in automatic speech recognition by using a knowledge graph to compute semantic relatedness between words for rescoring the N-best list, resulting in improved accuracy, though no specific numbers are provided.

With the evolution of neural network based methods, automatic speech recognition (ASR) field has been advanced to a level where building an application with speech interface is a reality. In spite of these advances, building a real-time speech recogniser faces several problems such as low recognition accuracy, domain constraint, and out-of-vocabulary words. The low recognition accuracy problem is addressed by improving the acoustic model, language model, decoder and by rescoring the N-best list at the output of the decoder. We are considering the N-best list rescoring approach to improve the recognition accuracy. Most of the methods in the literature use the grammatical, lexical, syntactic and semantic connection between the words in a recognised sentence as a feature to rescore. In this paper, we have tried to see the semantic relatedness between the words in a sentence to rescore the N-best list. Semantic relatedness is computed using TransE~\cite{bordes2013translating}, a method for low dimensional embedding of a triple in a knowledge graph. The novelty of the paper is the application of semantic web to automatic speech recognition.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes