Higher Order Recurrent Neural Networks
This work addresses the challenge of long-term dependency modeling in sequence tasks like language modeling, offering a novel architectural improvement over existing methods.
The authors tackle the problem of modeling long-term dependencies in sequential data by proposing higher order recurrent neural networks (HORNNs), which use more memory units to track preceding states. On language modeling with the Penn Treebank and English text8 datasets, HORNNs achieve state-of-the-art performance, significantly outperforming regular RNNs and LSTMs.
In this paper, we study novel neural network structures to better model long-term dependencies in sequential data. We propose to use more memory units in recurrent neural networks (RNNs) to keep track of more preceding states, all of which are recurrently fed back to the hidden layers through different weighted paths. By extending the popular recurrent structure in RNNs, we provide the models with a better short-term memory mechanism for learning long-term dependencies in sequences. By analogy with digital filters in signal processing, we call these structures higher order RNNs (HORNNs). Like RNNs, HORNNs can be trained using back-propagation through time. HORNNs are generally applicable to a variety of sequence modeling tasks. In this work, we examine HORNNs on the language modeling task using two popular data sets, namely the Penn Treebank (PTB) and English text8 data sets. Experimental results show that the proposed HORNNs yield state-of-the-art performance on both data sets, significantly outperforming regular RNNs as well as the popular LSTMs.
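As a rough illustration of the structure described above, the sketch below feeds the N most recent hidden states back into the hidden layer, each through its own weight matrix, i.e. h_t = tanh(W_in x_t + sum over n of W_n h_{t-n}). The abstract does not spell out the exact formulation, so the tanh activation, the plain summation of the feedback paths, the zero initial states, and all names (hornn_forward, w_in, w_rec, random_matrix) are assumptions made for illustration, not the authors' implementation. It uses only the Python standard library and covers the forward pass only, not training.

```python
import math
import random
from collections import deque


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def hornn_forward(inputs, w_in, w_rec, hidden_size):
    """Run an order-N recurrence over a sequence of input vectors.

    w_rec is a list of N hidden-to-hidden matrices; w_rec[n] weights the
    hidden state from n + 1 steps back. With N == 1 this is a plain RNN.
    """
    order = len(w_rec)
    # Memory of the last `order` hidden states, most recent first (assumed zero at start).
    history = deque([[0.0] * hidden_size for _ in range(order)], maxlen=order)
    outputs = []
    for x in inputs:
        pre = matvec(w_in, x)
        # Each preceding state reaches the hidden layer through its own weighted path.
        for w_n, h_prev in zip(w_rec, history):
            fed_back = matvec(w_n, h_prev)
            pre = [p + f for p, f in zip(pre, fed_back)]
        h = [math.tanh(p) for p in pre]
        history.appendleft(h)  # the oldest state drops off the other end
        outputs.append(h)
    return outputs


def random_matrix(rows, cols, rng, scale=0.1):
    """Small random weights, for demonstration only."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
input_size, hidden_size, order = 4, 3, 3
w_in = random_matrix(hidden_size, input_size, rng)
w_rec = [random_matrix(hidden_size, hidden_size, rng) for _ in range(order)]
sequence = [[rng.uniform(-1, 1) for _ in range(input_size)] for _ in range(5)]
states = hornn_forward(sequence, w_in, w_rec, hidden_size)
print(len(states), len(states[0]))
```

Passing a single matrix in w_rec reduces the sketch to an ordinary RNN, which makes the role of the extra feedback paths easy to compare: the order-3 and order-1 versions agree until a hidden state from two or more steps back becomes non-zero.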