CL, Oct 12, 2017

Convolutional Attention-based Seq2Seq Neural Network for End-to-End ASR

arXiv:1710.04515v1
Originality: Synthesis-oriented
AI Analysis

This work addresses speech recognition for applications requiring accurate transcription, but it appears incremental as it combines existing techniques like attention and convolutional networks.

The paper tackles end-to-end automatic speech recognition (ASR) by proposing a convolutional attention-based sequence-to-sequence neural network, which achieves a 15.8% phoneme error rate on the TIMIT dataset.
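For readers unfamiliar with the metric, phoneme error rate is conventionally the edit distance (substitutions, deletions, insertions) between the recognized and reference phoneme sequences, divided by the number of reference phonemes. The sketch below illustrates that standard definition only; the function names and the toy phoneme sequences are hypothetical, and the paper's exact scoring setup (for example, how TIMIT phoneme classes are merged) is not stated here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two phoneme sequences (lists of labels)."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hyp
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference phoneme
                curr[j - 1] + 1,     # insertion of a hypothesis phoneme
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def phoneme_error_rate(refs, hyps):
    """Total edit distance over all utterances / total reference phonemes."""
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total = sum(len(r) for r in refs)
    return errors / total


# Toy example (made-up sequences): one substitution and one deletion.
ref = ["sil", "dh", "ax", "k", "ae", "t", "sil"]
hyp = ["sil", "dh", "ah", "k", "ae", "sil"]
per = phoneme_error_rate([ref], [hyp])  # 2 errors over 7 reference phonemes
```

Under this definition, a 15.8% rate corresponds to roughly 16 edit errors per 100 reference phonemes.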

This thesis introduces a sequence-to-sequence model with Luong's attention mechanism for end-to-end ASR. It also describes the neural network techniques that constitute the convolutional attention-based seq2seq network, including batch normalization, dropout, and residual networks. Finally, the proposed model proves effective for speech recognition, achieving a 15.8% phoneme error rate on the TIMIT dataset.
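As a rough illustration of the attention step named above, the following is a minimal sketch of Luong-style global attention using the dot-product score. The abstract does not say which of Luong's score functions the thesis uses, so the dot score is an assumption here, and the function names and toy vectors are hypothetical. The sketch covers only the alignment weights and the context vector, not the subsequent projection that combines the context with the decoder state.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def luong_dot_attention(decoder_state, encoder_states):
    """Return (weights, context) for one decoder step.

    decoder_state:  list of floats, the current decoder hidden state h_t
    encoder_states: list of lists, one encoder hidden state h_s per input frame
    """
    # Dot-product score between h_t and every h_s (assumed score function).
    scores = [sum(d * e for d, e in zip(decoder_state, h_s))
              for h_s in encoder_states]
    # Alignment weights: softmax over encoder positions.
    weights = softmax(scores)
    # Context vector: weighted average of the encoder states.
    dim = len(decoder_state)
    context = [sum(w * h_s[k] for w, h_s in zip(weights, encoder_states))
               for k in range(dim)]
    return weights, context


# Toy example: the first encoder state is most similar to the decoder state.
h_t = [1.0, 0.0]
h_enc = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
weights, context = luong_dot_attention(h_t, h_enc)
```

In this toy case the weights decrease from the first to the last encoder state, so the context vector is dominated by the frame that best matches the decoder state.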

