CL | NE | Oct 16, 2014

Constructing Long Short-Term Memory based Deep Recurrent Neural Networks for Large Vocabulary Speech Recognition

arXiv:1410.4281v2 | 323 citations
Originality: Incremental advance
AI Analysis

This work addresses speech recognition for large vocabulary applications, but it is incremental as it builds on existing LSTM methods.

The researchers sought to improve speech recognition by developing deep Long Short-Term Memory (LSTM) architectures, achieving state-of-the-art performance on a large vocabulary conversational telephone speech task.

Long short-term memory (LSTM) based acoustic modeling methods have recently been shown to give state-of-the-art performance on some speech recognition tasks. To achieve further performance improvement, this research investigates deep extensions of LSTM, considering that deep hierarchical models have turned out to be more efficient than shallow ones. Motivated by previous research on constructing deep recurrent neural networks (RNNs), alternative deep LSTM architectures are proposed and empirically evaluated on a large vocabulary conversational telephone speech recognition task. Meanwhile, the training process for LSTM networks on multi-GPU devices is introduced and discussed. Experimental results demonstrate that the deep LSTM networks benefit from depth and yield state-of-the-art performance on this task.
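
As a rough illustration of what "depth" means for an LSTM, the sketch below stacks standard LSTM layers so that the hidden-state sequence of one layer becomes the input sequence of the next. This is only the plainest form of stacking and is not claimed to be any of the alternative architectures the paper proposes, which the abstract does not detail. The function names (`make_layer`, `lstm_step`, `deep_lstm_forward`), the layer sizes, and the random initialization are all assumptions made for the example, and no training or multi-GPU logic is included.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_layer(input_size, hidden_size, rng):
    """Random weights for one LSTM layer (hypothetical initialization)."""
    cols = input_size + hidden_size
    # Four gate blocks (input, forget, candidate, output), hidden_size rows each.
    W = [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
         for _ in range(4 * hidden_size)]
    b = [0.0] * (4 * hidden_size)
    return {"W": W, "b": b, "hidden_size": hidden_size}


def lstm_step(layer, x, h_prev, c_prev):
    """One time step of a standard LSTM cell (no peepholes, no projection)."""
    n = layer["hidden_size"]
    z = x + h_prev  # list concatenation: current input and previous hidden state
    pre = [sum(w * v for w, v in zip(row, z)) + bias
           for row, bias in zip(layer["W"], layer["b"])]
    i = [sigmoid(p) for p in pre[0:n]]            # input gate
    f = [sigmoid(p) for p in pre[n:2 * n]]        # forget gate
    g = [math.tanh(p) for p in pre[2 * n:3 * n]]  # candidate cell values
    o = [sigmoid(p) for p in pre[3 * n:4 * n]]    # output gate
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


def deep_lstm_forward(layers, inputs):
    """Run a sequence through stacked layers; each layer feeds the next."""
    seq = inputs
    for layer in layers:
        n = layer["hidden_size"]
        h, c = [0.0] * n, [0.0] * n
        outputs = []
        for x in seq:
            h, c = lstm_step(layer, x, h, c)
            outputs.append(h)
        seq = outputs  # hidden states become the next layer's input sequence
    return seq


# Example: 5 frames of 3-dimensional features through two stacked layers.
rng = random.Random(0)
layers = [make_layer(3, 4, rng), make_layer(4, 4, rng)]
frames = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(5)]
top = deep_lstm_forward(layers, frames)
```

Here `top` holds one hidden vector per input frame from the topmost layer. In an acoustic model these vectors would typically feed an output layer over phonetic states, which is omitted from this sketch.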

