AS CL LG SD | Oct 30, 2018

Bi-Directional Lattice Recurrent Neural Networks for Confidence Estimation

arXiv:1810.13024v2 | 26 citations
Originality: Incremental advance
AI Analysis

This work addresses the unreliability of word posteriors as confidence scores in upstream and downstream applications such as speaker adaptation and semi-supervised training. The advance is incremental: it extends existing BiRNN confidence models from 1-best sequences to confusion networks and lattices.

The paper tackles the problem of assigning better confidence scores to all words in confusion networks or lattices produced by automatic speech recognition. It shows that extending bi-directional recurrent neural networks (BiRNNs) to these structures provides a significant improvement in confidence estimation.

The standard approach to mitigate errors made by an automatic speech recognition system is to use confidence scores associated with each predicted word. In the simplest case, these scores are word posterior probabilities whilst more complex schemes utilise bi-directional recurrent neural network (BiRNN) models. A number of upstream and downstream applications, however, rely on confidence scores assigned not only to 1-best hypotheses but to all words found in confusion networks or lattices. These include but are not limited to speaker adaptation, semi-supervised training and information retrieval. Although word posteriors could be used in those applications as confidence scores, they are known to have reliability issues. To make improved confidence scores more generally available, this paper shows how BiRNNs can be extended from 1-best sequences to confusion network and lattice structures. Experiments are conducted using one of the Cambridge University submissions to the IARPA OpenKWS 2016 competition. The results show that confusion network and lattice-based BiRNNs can provide a significant improvement in confidence estimation.
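As a rough illustration of the 1-best starting point mentioned in the abstract, the sketch below runs a small bi-directional recurrent network over a word sequence and maps each word's forward and backward states to a score in (0, 1). It is not the paper's architecture: the Elman-style tanh recurrence, the two per-word features (word posterior and duration), the layer sizes, and every function name are assumptions made for illustration, and the weights are random rather than trained, so the printed scores carry no meaning.

```python
import math
import random


def recurrent_step(x, h, W, U, b):
    """One Elman-style step: h' = tanh(W x + U h + b)."""
    return [
        math.tanh(
            sum(W[i][j] * x[j] for j in range(len(x)))
            + sum(U[i][k] * h[k] for k in range(len(h)))
            + b[i]
        )
        for i in range(len(b))
    ]


def run_direction(features, W, U, b, reverse=False):
    """Run the recurrence left-to-right, or right-to-left if reverse=True."""
    n = len(features)
    hidden = [0.0] * len(b)
    states = [None] * n
    order = range(n - 1, -1, -1) if reverse else range(n)
    for t in order:
        hidden = recurrent_step(features[t], hidden, W, U, b)
        states[t] = hidden
    return states


def birnn_confidences(features, params):
    """Return one sigmoid score per word from concatenated fwd/bwd states."""
    fwd = run_direction(features, *params["fwd"])
    bwd = run_direction(features, *params["bwd"], reverse=True)
    v, c = params["out"]
    scores = []
    for f, g in zip(fwd, bwd):
        z = sum(vi * hi for vi, hi in zip(v, f + g)) + c
        scores.append(1.0 / (1.0 + math.exp(-z)))
    return scores


def random_params(n_in, n_hid, seed=0):
    """Untrained, randomly initialised weights (illustration only)."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    def direction():
        return (mat(n_hid, n_in), mat(n_hid, n_hid), [0.0] * n_hid)

    out = ([rng.uniform(-0.5, 0.5) for _ in range(2 * n_hid)], 0.0)
    return {"fwd": direction(), "bwd": direction(), "out": out}


# Hypothetical per-word features: [word posterior, duration in seconds].
features = [[0.95, 0.30], [0.40, 0.12], [0.88, 0.45]]
params = random_params(n_in=2, n_hid=4)
confidences = birnn_confidences(features, params)
print(confidences)
```

Because each score depends on both the forward and the backward state, the estimate for a word can draw on context from either side of it. Extending this to confusion networks or lattices, as the paper proposes, requires handling words with multiple incoming or outgoing arcs, which this 1-best sketch does not attempt.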

Code Implementations: 4 repos
