CL, SD, AS | Jul 11, 2018

A Fast-Converged Acoustic Modeling for Korean Speech Recognition: A Preliminary Study on Time Delay Neural Network

arXiv:1807.05855v1 | 2 citations
Originality: Incremental advance
AI Analysis

This is an incremental improvement for Korean speech recognition, addressing efficiency in training with limited data.

The authors tackled the problem of slow convergence in Korean speech recognition by proposing a time delay neural network (TDNN) based acoustic model, which achieved a 2.12% absolute improvement in character error rate and converged 1.67 times faster than a feed-forward neural network (FFNN) model.

In this paper, a time delay neural network (TDNN) based acoustic model is proposed for fast-converging acoustic modeling in Korean speech recognition. The TDNN converges quickly when the amount of training data is limited, owing to subsampling, which excludes duplicated weights. The TDNN showed an absolute improvement of 2.12% in character error rate over feed-forward neural network (FFNN) based modeling on Korean speech corpora. The proposed model converged 1.67 times faster than an FFNN-based model.
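
To make the subsampling idea concrete, the following is a minimal NumPy sketch of a single TDNN layer that splices input frames at chosen time offsets before an affine transform and ReLU. It is not the authors' implementation: the function names (`tdnn_layer`, `init_params`), the feature and hidden dimensions, and the offset sets are all assumptions chosen for illustration, since the abstract does not specify the architecture. The sketch only shows that a subsampled context such as [-2, 2] covers the same time span as the full context [-2, ..., 2] with fewer weights.

```python
import numpy as np

def tdnn_layer(x, weights, bias, offsets):
    """One TDNN layer: splice frames at the given time offsets, then affine + ReLU.

    x: (T, D) input frames; weights: (len(offsets) * D, H); bias: (H,).
    Only time steps where every offset falls inside the input are produced.
    """
    lo, hi = min(offsets), max(offsets)
    t_start, t_end = max(0, -lo), x.shape[0] - max(0, hi)
    spliced = np.stack(
        [np.concatenate([x[t + o] for o in offsets]) for t in range(t_start, t_end)]
    )
    return np.maximum(spliced @ weights + bias, 0.0)

def init_params(rng, in_dim, hidden_dim, offsets):
    """Hypothetical helper: small random weights sized for the spliced input."""
    w = rng.standard_normal((len(offsets) * in_dim, hidden_dim)) * 0.1
    b = np.zeros(hidden_dim)
    return w, b

rng = np.random.default_rng(0)
frames = rng.standard_normal((20, 40))  # 20 frames of 40-dim features (illustrative sizes)

full_offsets = [-2, -1, 0, 1, 2]  # every frame in the context window
sub_offsets = [-2, 2]             # subsampled: only the window edges

w_full, b_full = init_params(rng, 40, 64, full_offsets)
w_sub, b_sub = init_params(rng, 40, 64, sub_offsets)

out_full = tdnn_layer(frames, w_full, b_full, full_offsets)
out_sub = tdnn_layer(frames, w_sub, b_sub, sub_offsets)

print(out_full.shape, w_full.size)
print(out_sub.shape, w_sub.size)
```

Both layers see the same five-frame span and produce outputs of the same shape, but under these assumed sizes the subsampled layer has 2/5 of the weights of the full-context layer. Having fewer parameters to fit is one plausible reading of why the abstract associates subsampling with faster convergence on limited data.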
