CL SD AS | Jul 11, 2018

A Fast-Converged Acoustic Modeling for Korean Speech Recognition: A Preliminary Study on Time Delay Neural Network

Hosung Park, Donghyun Lee, Minkyu Lim, Yoseb Kang, Juneseok Oh, Ji-Hwan Kim

arXiv:1807.05855v1 | 0.22 citations

Originality: Incremental advance

AI Analysis

This is an incremental improvement for Korean speech recognition, addressing efficiency in training with limited data.

The authors tackled the problem of slow convergence in Korean speech recognition by proposing a time delay neural network (TDNN) based acoustic model, which achieved a 2.12% absolute improvement in character error rate and converged 1.67 times faster than a feed-forward neural network (FFNN) model.
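The difference between an absolute and a relative improvement in character error rate (CER) is easy to blur, so a minimal sketch follows. It assumes CER is the Levenshtein distance between reference and hypothesis divided by the reference length, with each Python string element counted as one character. The unit the paper actually scores on is not stated here, and the function names are illustrative only.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]


def cer(ref, hyp):
    """Character error rate in percent (assumed definition, see lead-in)."""
    return 100.0 * edit_distance(ref, hyp) / len(ref)


def absolute_improvement(baseline_cer, new_cer):
    """Difference in percentage points between two CER values."""
    return baseline_cer - new_cer


def relative_improvement(baseline_cer, new_cer):
    """Reduction expressed as a percentage of the baseline CER."""
    return 100.0 * (baseline_cer - new_cer) / baseline_cer
```

Under this reading, the reported 2.12% is a gap in percentage points between the two models' CERs, not a fraction of the baseline error.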

In this paper, a time delay neural network (TDNN) based acoustic model is proposed for fast-converging acoustic modeling in Korean speech recognition. The TDNN converges quickly when the amount of training data is limited, because subsampling excludes duplicated weights. Compared with feed-forward neural network (FFNN) based modeling on Korean speech corpora, the TDNN achieved an absolute improvement of 2.12% in character error rate. The proposed model also converged 1.67 times faster than the FFNN-based model.
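The effect of subsampling can be sketched by counting how many hidden activations are needed to produce a single output frame. The snippet below is an illustration only: the per-layer context offsets are hypothetical values in the style commonly used for subsampled TDNNs, not the configuration reported in the paper, and the helper names are made up for this example.

```python
def receptive_field(layer_offsets):
    """Total (left, right) input context in frames for the stacked layers."""
    left = sum(min(offsets) for offsets in layer_offsets)
    right = sum(max(offsets) for offsets in layer_offsets)
    return left, right


def required_frames(layer_offsets, t=0):
    """Time indices needed at each level to compute the output at time t.

    Returns a list whose first entry is the set of input frames and whose
    last entry is {t}; entries in between are hidden-layer time steps.
    """
    needed = {t}
    levels = [needed]
    for offsets in reversed(layer_offsets):
        needed = {u + o for u in needed for o in offsets}
        levels.append(needed)
    levels.reverse()
    return levels


# Hypothetical configurations (assumption, not taken from the paper).
# Subsampled: upper layers splice only the two endpoints of their context.
subsampled = [[-2, -1, 0, 1, 2], [-1, 2], [-3, 3], [-7, 2]]
# Dense: every frame inside the same context window is spliced.
dense = [list(range(-2, 3)), list(range(-1, 3)),
         list(range(-3, 4)), list(range(-7, 3))]

subsampled_counts = [len(s) for s in required_frames(subsampled)]
dense_counts = [len(s) for s in required_frames(dense)]
```

Both configurations see the same span of input frames, but the subsampled one evaluates far fewer hidden time steps per output, which is one way to read the abstract's remark that subsampling removes duplicated computation.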
