CL, Oct 17, 2016

End-to-end attention-based distant speech recognition with Highway LSTM

arXiv:1610.05361v1, 4 citations
Originality: Incremental advance
AI Analysis

This work addresses speech recognition in noisy, distant settings, but it is incremental as it builds on existing attention-based models.

The authors tackled distant speech recognition by extending end-to-end attention-based models with multichannel input and Highway LSTM, achieving improved performance on the AMI benchmark.

End-to-end attention-based models have been shown to be competitive alternatives to conventional DNN-HMM models in speech recognition systems. In this paper, we extend existing end-to-end attention-based models so that they can be applied to the Distant Speech Recognition (DSR) task. Specifically, we propose an end-to-end attention-based speech recognizer with multichannel input that performs sequence prediction directly at the character level. To improve performance, we also incorporate Highway long short-term memory (HLSTM), which outperforms previous models on the AMI distant speech recognition task.

