CL LG AS Jan 4, 2020

Transformer-based language modeling and decoding for conversational speech recognition

arXiv:2001.01140v1
Originality: Synthesis-oriented
AI Analysis

This work addresses the challenge of efficient decoding for conversational speech recognition, but it appears incremental as it adapts existing transformer methods to a specific framework.

The authors integrate transformer-based language models into conversational speech recognition by proposing an efficient lattice re-scoring method within a weighted finite-state transducer framework. The method exploits the transformer's ability to capture long-range history and to avoid sequential computation.

We propose a way to use a transformer-based language model in conversational speech recognition. Specifically, we focus on decoding efficiently in a weighted finite-state transducer framework. We showcase an approach to lattice re-scoring that allows for longer range history captured by a transformer-based language model and takes advantage of a transformer's ability to avoid computing sequentially.
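
To make the idea of lattice re-scoring concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' algorithm: the lattice is a toy word graph, paths are enumerated exhaustively (a real system would prune or merge histories), and `toy_lm_score` is a hypothetical stand-in for a transformer language model. The point it shows is that each hypothesis is scored as a whole sequence, so a real transformer could score all of them in one padded batch instead of word by word.

```python
from typing import Callable, Dict, List, Tuple

# Toy lattice (assumption): node -> list of (next_node, word, acoustic_log_score).
Lattice = Dict[int, List[Tuple[int, str, float]]]

LATTICE: Lattice = {
    0: [(1, "how", -1.0), (1, "cow", -0.8)],
    1: [(2, "are", -1.0), (2, "or", -0.9)],
    2: [(3, "you", -0.5), (3, "ewe", -0.4)],
}


def enumerate_paths(lattice: Lattice, start: int, final: int) -> List[Tuple[List[str], float]]:
    """Return every (word sequence, summed acoustic score) from start to final."""
    paths = []
    stack = [(start, [], 0.0)]
    while stack:
        node, words, score = stack.pop()
        if node == final:
            paths.append((words, score))
            continue
        for nxt, word, acoustic in lattice.get(node, []):
            stack.append((nxt, words + [word], score + acoustic))
    return paths


def toy_lm_score(words: List[str]) -> float:
    """Hypothetical stand-in for a transformer LM: penalise unfamiliar bigrams."""
    favoured = {("how", "are"), ("are", "you")}
    return sum(0.0 if bigram in favoured else -2.0 for bigram in zip(words, words[1:]))


def rescore(lattice: Lattice, start: int, final: int,
            lm_score: Callable[[List[str]], float], lm_weight: float = 1.0) -> List[str]:
    """Pick the path with the best combined acoustic and weighted LM score."""
    paths = enumerate_paths(lattice, start, final)
    # Each hypothesis is scored whole; with a real transformer these
    # sequences would be padded and scored together in a single batch.
    scored = [(acoustic + lm_weight * lm_score(words), words) for words, acoustic in paths]
    return max(scored, key=lambda item: item[0])[1]


best = rescore(LATTICE, 0, 3, toy_lm_score)
```

With the toy scores above, the acoustically best path is "cow or ewe", while adding the language model score moves the choice to "how are you". Setting `lm_weight=0.0` recovers the acoustic-only result.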

