Complex Evolution Recurrent Neural Networks (ceRNNs)
This work addresses the need for efficient and effective recurrent neural networks for speech recognition, but the contribution is incremental: it modifies an existing method (uRNNs) by selectively dropping constraints.
The paper examines how important the unitary property is in unitary evolution recurrent neural networks (uRNNs) and how these models perform on large tasks. It proposes complex evolution RNNs (ceRNNs), which drop the unitary property selectively; prepended to baseline LSTM acoustic models for speech recognition, a ceRNN yields a 0.8% absolute WER improvement.
Unitary Evolution Recurrent Neural Networks (uRNNs) have three attractive properties: (a) the unitary property, (b) their complex-valued nature, and (c) their efficient linear operators. The literature so far does not address how critical the unitary property is to the model. Furthermore, uRNNs have not been evaluated on large tasks. To study these shortcomings, we propose complex evolution Recurrent Neural Networks (ceRNNs), which are similar to uRNNs but drop the unitary property selectively. On a simple multivariate linear regression task, we illustrate that dropping the constraints improves the learning trajectory. On the copy memory task, ceRNNs and uRNNs perform identically, demonstrating that their superior performance over LSTMs is due to their complex-valued nature and their linear operators. On a large-scale real-world speech recognition task, we find that prepending a uRNN degrades the performance of our baseline LSTM acoustic models, while prepending a ceRNN improves over the baseline by 0.8% absolute WER.
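
To make the unitary property concrete, here is a minimal sketch in plain Python. It illustrates the general idea only and is not the paper's implementation. It assumes, for illustration, a single diagonal linear operator: the constrained version has entries of the form e^{iθ} (unit modulus, hence unitary and norm-preserving), while the relaxed version has unconstrained complex entries. The text above does not say which operators ceRNNs relax, so the choice of a diagonal operator, the function names, and the numeric values are all hypothetical.

```python
import cmath
import math
import random


def unitary_diagonal(phases):
    """Diagonal entries e^{i*theta}: each has modulus 1, so the operator is unitary."""
    return [cmath.exp(1j * theta) for theta in phases]


def relaxed_diagonal(real_parts, imag_parts):
    """Unconstrained complex entries: the modulus is free, so not unitary in general."""
    return [complex(re, im) for re, im in zip(real_parts, imag_parts)]


def apply_diagonal(diag, vec):
    """Multiply a vector by a diagonal matrix in O(n) time."""
    return [d * v for d, v in zip(diag, vec)]


def norm(vec):
    """Euclidean norm of a complex vector."""
    return math.sqrt(sum(abs(v) ** 2 for v in vec))


rng = random.Random(0)
n = 8

# Hypothetical complex-valued hidden state.
h = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]

# Constrained operator: parametrized by phases only.
d_unitary = unitary_diagonal([rng.uniform(-math.pi, math.pi) for _ in range(n)])

# Relaxed operator: illustrative values chosen so that the norm doubles.
d_relaxed = relaxed_diagonal([2.0] * n, [0.0] * n)

h_unitary = apply_diagonal(d_unitary, h)
h_relaxed = apply_diagonal(d_relaxed, h)

print("input norm:        ", norm(h))
print("after unitary op:  ", norm(h_unitary))  # same as the input norm
print("after relaxed op:  ", norm(h_relaxed))  # scaled, no longer preserved
```

Both operators are complex-valued and cost O(n) to apply. Only the first preserves the norm of the hidden state, and that is the constraint whose importance the abstract questions.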