AS LG SD ML

Jul 1, 2019

LSTM Language Models for LVCSR in First-Pass Decoding and Lattice-Rescoring

Eugen Beck, Wei Zhou, Ralf Schlüter, Hermann Ney

arXiv:1907.01030v113.434 citations

Originality: Incremental advance

AI Analysis

This work addresses a known bottleneck in LVCSR systems, the cost of decoding with LSTM language models, and offers an incremental improvement in decoding efficiency.

The paper tackles the challenge of efficiently integrating LSTM language models into large vocabulary continuous speech recognition (LVCSR) systems. It proposes combining first-pass decoding with lattice rescoring and reports competitive results on the Hub5'00 and Librispeech corpora with a runtime better than real time.

LSTM-based language models are an important part of modern LVCSR systems, as they significantly improve performance over traditional backoff language models. Incorporating them efficiently into decoding has been notoriously difficult. In this paper we present an approach based on a combination of one-pass decoding and lattice rescoring. We perform decoding with the LSTM-LM in the first pass but recombine hypotheses that share the last two words; afterwards we rescore the resulting lattice. We run our systems on GPGPU-equipped machines and are able to produce competitive results on the Hub5'00 and Librispeech evaluation corpora with a runtime better than real time. In addition, we briefly investigate the possibility of carrying out the full sum over all state sequences belonging to a given word hypothesis during decoding without recombination.
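The recombination step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' decoder: the `Hypothesis` type, the `recombine` function, and the log-score convention (higher is better) are assumptions made for illustration only. The sketch merges hypotheses whose last two words match and keeps the best-scoring one per group. In a real decoder the surviving hypothesis would presumably also carry its LSTM state forward, which makes the merge an approximation, since that state depends on the full word history.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: (word sequence, log score); higher is better.
Hypothesis = Tuple[Tuple[str, ...], float]


def recombine(hyps: List[Hypothesis], context_size: int = 2) -> List[Hypothesis]:
    """Keep the best-scoring hypothesis for each distinct trailing word context."""
    if context_size < 1:
        raise ValueError("context_size must be at least 1")
    best: Dict[Tuple[str, ...], Hypothesis] = {}
    for words, score in hyps:
        # Hypotheses sharing the last `context_size` words fall into one group.
        key = words[-context_size:]
        if key not in best or score > best[key][1]:
            best[key] = (words, score)
    return list(best.values())


hyps = [
    (("the", "cat", "sat", "on"), -4.1),
    (("a", "cat", "sat", "on"), -5.3),   # same last two words as the first
    (("the", "cat", "sat", "in"), -4.8),
]
survivors = recombine(hyps)
```

Here the first two hypotheses share the context `("sat", "on")`, so only the better-scoring one survives; the third has a different context and is kept. A smaller `context_size` prunes more aggressively at the cost of a coarser approximation.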
