CL · LG · NE · Jul 7, 2016

Sequence Training and Adaptation of Highway Deep Neural Networks

arXiv:1607.01963v5 · 6 citations
Originality: Synthesis-oriented
AI Analysis

This work reports incremental improvements in meeting speech recognition, focusing on one specific HDNN architecture in which the two gate functions are tied across all hidden layers.

The paper improves the speech recognition accuracy of Highway Deep Neural Networks (HDNNs) by applying sequence-discriminative training and speaker adaptation on the AMI meeting corpus, and reports considerable improvements from updating only the tied gate functions.

The highway deep neural network (HDNN) is a type of depth-gated feedforward neural network that has been shown to be easier to train with more hidden layers and to generalise better than conventional plain deep neural networks (DNNs). Previously, we investigated a structured HDNN architecture for speech recognition in which the two gate functions were tied across all the hidden layers, and we were able to train a much smaller model without sacrificing recognition accuracy. In this paper, we continue the study of this architecture with a sequence-discriminative training criterion and speaker adaptation techniques on the AMI meeting speech recognition corpus. We show that these two techniques improve speech recognition accuracy over the model trained with the cross-entropy criterion. Furthermore, we demonstrate that the two gate functions, which are tied across all the hidden layers, are able to control the information flow over the whole network, and that considerable improvements can be achieved by updating only these gate functions in both the sequence training and the adaptation experiments.
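
The tied-gate structure described in the abstract can be illustrated with a small sketch. It assumes the usual highway formulation, in which each hidden layer outputs `T(h) * candidate(h) + C(h) * h` with a sigmoid transform gate `T` and a sigmoid carry gate `C`, and it shares one pair of gate parameters across every hidden layer. The layer sizes, the initialisation, and all function names are hypothetical choices made for illustration. The input projection, the softmax output layer, and the training criteria are omitted, so this is not the authors' implementation.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, b, x):
    # Computes W x + b for a square weight matrix stored as a list of rows.
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def init_affine(n, rng, scale=0.1):
    # Hypothetical initialisation: small uniform weights, zero biases.
    W = [[rng.uniform(-scale, scale) for _ in range(n)] for _ in range(n)]
    return W, [0.0] * n


def make_hdnn(n_hidden, n_layers, seed=0):
    rng = random.Random(seed)
    return {
        # One weight matrix and bias per hidden layer.
        "layers": [init_affine(n_hidden, rng) for _ in range(n_layers)],
        # A single transform gate and a single carry gate, shared by all layers.
        "transform_gate": init_affine(n_hidden, rng),
        "carry_gate": init_affine(n_hidden, rng),
    }


def forward(params, h):
    Wt, bt = params["transform_gate"]
    Wc, bc = params["carry_gate"]
    for W, b in params["layers"]:
        candidate = [sigmoid(a) for a in affine(W, b, h)]
        t = [sigmoid(a) for a in affine(Wt, bt, h)]  # how much new activation to admit
        c = [sigmoid(a) for a in affine(Wc, bc, h)]  # how much of the input to carry over
        h = [ti * ci + gi * hi for ti, ci, gi, hi in zip(t, candidate, c, h)]
    return h


def count_parameters(params):
    def size(wb):
        W, b = wb
        return len(W) * len(W[0]) + len(b)

    gate_count = size(params["transform_gate"]) + size(params["carry_gate"])
    layer_count = sum(size(layer) for layer in params["layers"])
    return gate_count, layer_count
```

In this toy configuration with 4 hidden units and 10 hidden layers, `count_parameters` reports 40 gate parameters against 200 layer parameters. Because the same gates act at every layer, changing only those 40 values alters the information flow through the whole stack, which is the property the abstract relies on when it updates only the gate functions.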
