CL · Oct 12, 2017

Convolutional Attention-based Seq2Seq Neural Network for End-to-End ASR

arXiv:1710.04515v1 0.3

Originality: Synthesis-oriented

AI Analysis

This work addresses speech recognition for applications requiring accurate transcription, but it appears incremental as it combines existing techniques like attention and convolutional networks.

The paper tackles end-to-end automatic speech recognition by proposing a convolutional attention-based sequence-to-sequence neural network, which achieves a 15.8% phoneme error rate on the TIMIT dataset.
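For readers unfamiliar with the metric, phoneme error rate is conventionally the edit distance (substitutions, deletions, insertions) between the predicted and reference phoneme sequences, divided by the total number of reference phonemes. The sketch below illustrates that convention only; the function names are hypothetical, and the paper's exact scoring setup (for example, any folding of the TIMIT phone set) is not described in this summary.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two phoneme sequences (lists of labels)."""
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def phoneme_error_rate(references, hypotheses):
    """Total edit errors over all utterances divided by total reference phonemes."""
    total_errors = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_phones = sum(len(r) for r in references)
    return total_errors / total_phones


# Toy example (made-up labels, not TIMIT data): one substitution, one deletion.
refs = [["sil", "k", "ae", "t", "sil"]]
hyps = [["sil", "k", "ah", "t"]]
per = phoneme_error_rate(refs, hyps)  # 2 errors / 5 reference phonemes
```

Under this definition, a 15.8% rate corresponds to roughly 16 edit errors per 100 reference phonemes.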

This thesis introduces a sequence-to-sequence model with Luong's attention mechanism for end-to-end ASR. It also describes the neural network techniques that make up the convolutional attention-based seq2seq network, including batch normalization, dropout, and residual networks. Finally, the proposed model proves effective for speech recognition, achieving a 15.8% phoneme error rate on the TIMIT dataset.
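As a rough illustration of the attention step named in the abstract, the sketch below implements the dot-product scoring variant of Luong-style global attention in plain Python. It is an assumption-laden sketch, not the thesis's implementation: the summary does not say which scoring function is used, the function names are hypothetical, and the final step that combines the context vector with the decoder state through a learned layer is omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def luong_dot_attention(decoder_state, encoder_states):
    """Dot-product global attention.

    decoder_state:  list of floats, the current decoder hidden state.
    encoder_states: list of same-length lists, one per encoder time step.
    Returns (weights, context): the alignment weights over encoder steps
    and the weighted average of the encoder states.
    """
    # Score each encoder state by its dot product with the decoder state.
    scores = [sum(d * e for d, e in zip(decoder_state, h_s))
              for h_s in encoder_states]
    weights = softmax(scores)
    dim = len(decoder_state)
    # Context vector: attention-weighted sum of encoder states.
    context = [sum(w * h_s[k] for w, h_s in zip(weights, encoder_states))
               for k in range(dim)]
    return weights, context


# Toy inputs (made-up numbers): the decoder state aligns with the first frame.
weights, context = luong_dot_attention([10.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

In a speech model, each encoder state would correspond to an acoustic frame (or a downsampled group of frames), so the weights indicate which part of the audio the decoder attends to when emitting the next phoneme.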
