CL · SD · AS · ML · Feb 29, 2020

Voice trigger detection from LVCSR hypothesis lattices using bidirectional lattice recurrent neural networks

arXiv:2003.00304v1 · 13 citations
Originality: Incremental advance
AI Analysis

This work addresses false triggers in personal assistants, an incremental improvement in a domain-specific application.

The paper tackles false voice triggers in speech-enabled personal assistants by post-processing the hypothesis lattice of a large-vocabulary continuous speech recognizer (LVCSR) with a Bidirectional Lattice Recurrent Neural Network (LatticeRNN), which significantly improves detection accuracy over using the 1-best result or the lattice posterior.

We propose a method to reduce false voice triggers of a speech-enabled personal assistant by post-processing the hypothesis lattice of a server-side large-vocabulary continuous speech recognizer (LVCSR) via a neural network. We first discuss how an estimate of the posterior probability of the trigger phrase can be obtained from the hypothesis lattice using known techniques to perform detection, then investigate a statistical model that processes the lattice in a more explicitly data-driven, discriminative manner. We propose using a Bidirectional Lattice Recurrent Neural Network (LatticeRNN) for the task, and show that it can significantly improve detection accuracy over using the 1-best result or the posterior.
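
As an illustration of the posterior-based baseline mentioned in the abstract, the following is a minimal sketch of how a trigger-phrase posterior might be read off a hypothesis lattice with a forward pass in the log domain. It is not the paper's implementation: the arc format `(src, dst, word, log_score)`, the function name `trigger_posterior`, the assumption that node ids are topologically ordered, the choice to count only paths whose full word sequence equals the trigger phrase, and the toy phrase "hey assistant" are all assumptions made for this example.

```python
import math
from collections import defaultdict


def logaddexp(a, b):
    """Numerically stable log(exp(a) + exp(b)), tolerating -inf inputs."""
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def trigger_posterior(arcs, start, end, trigger):
    """Posterior mass of lattice paths whose words equal `trigger`.

    arcs: list of (src, dst, word, log_score) with src < dst, so that
    node ids are already in topological order (an assumption of this sketch).
    """
    # total[n]: log-sum of scores of all partial paths from start to n.
    total = defaultdict(lambda: -math.inf)
    # match[(n, k)]: same, restricted to paths spelling trigger[:k].
    match = defaultdict(lambda: -math.inf)
    total[start] = 0.0
    match[(start, 0)] = 0.0

    # Visiting arcs by source node guarantees every score entering `src`
    # has been accumulated before any arc leaving `src` is expanded.
    for src, dst, word, score in sorted(arcs, key=lambda a: a[0]):
        total[dst] = logaddexp(total[dst], total[src] + score)
        for k, expected in enumerate(trigger):
            if word == expected:
                match[(dst, k + 1)] = logaddexp(
                    match[(dst, k + 1)], match[(src, k)] + score
                )

    if total[end] == -math.inf:
        return 0.0  # no complete path reaches the end node
    return math.exp(match[(end, len(trigger))] - total[end])


# Toy lattice with two competing words in each of two slots.
toy_arcs = [
    (0, 1, "hey", math.log(0.6)),
    (0, 1, "hay", math.log(0.4)),
    (1, 2, "assistant", math.log(0.5)),
    (1, 2, "assistance", math.log(0.5)),
]
posterior = trigger_posterior(toy_arcs, 0, 2, ["hey", "assistant"])
```

On this toy lattice the posterior comes out at roughly 0.3 (0.6 × 0.5), whereas a 1-best decision would only report whether the single top-scoring path matches the phrase. A detector built on this quantity would compare it against a threshold; the LatticeRNN proposed in the paper instead learns the decision from the lattice discriminatively.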

