Deep Gate Recurrent Neural Network
This work addresses efficiency and stability issues in RNNs for sequence modeling, but it is incremental as it builds on existing gated units like GRU and LSTM.
The paper tackles the problem of learning long-term dependencies in recurrent neural networks by introducing two new structures, SGU and DSGU. Both use fewer parameters and less computation time than LSTM and GRU, and both learn faster in sequence classification tasks that require long-term dependency information.
This paper introduces two recurrent neural network structures, the Simple Gated Unit (SGU) and the Deep Simple Gated Unit (DSGU), which are general structures for learning long-term dependencies. Compared to the traditional Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU), both structures require fewer parameters and less computation time in sequence classification tasks. Unlike GRU and LSTM, which require more than one gate to control information flow in the network, SGU and DSGU use only one multiplicative gate. We show that this difference can accelerate learning in tasks that require long-term dependency information. We also show that DSGU is more numerically stable than SGU. In addition, we propose a standard way of representing the inner structure of an RNN, called the RNN Conventional Graph (RCG), which helps analyze the relationship between the input units and hidden units of an RNN.
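
To make the parameter argument concrete, the sketch below counts the weights in the textbook LSTM and GRU formulations and compares them with a hypothetical unit that has a single gate plus one candidate transform. The `single_gate` entry is an illustrative assumption, not the SGU or DSGU equations from the paper, which are not reproduced here. The helper names (`block_params`, `recurrent_params`, `compare`) are also invented for this example, and each affine block is assumed to carry one input matrix, one recurrent matrix, and one bias vector.

```python
# Rough parameter counts for gated recurrent units.
# Assumption: every gate or candidate transform is one affine block of the
# form W_x @ x + W_h @ h + b, with x of size d and h of size n.

def block_params(input_size, hidden_size):
    """Parameters in one affine block: W_x (n*d), W_h (n*n), bias (n)."""
    return hidden_size * input_size + hidden_size * hidden_size + hidden_size


def recurrent_params(input_size, hidden_size, num_blocks):
    """Total parameters for a unit made of `num_blocks` affine blocks."""
    return num_blocks * block_params(input_size, hidden_size)


# Number of affine blocks per unit.
# LSTM: input, forget, and output gates plus the cell candidate.
# GRU: update and reset gates plus the hidden candidate.
# single_gate: HYPOTHETICAL unit with one gate plus one candidate;
# this is not the paper's SGU definition.
BLOCKS = {"lstm": 4, "gru": 3, "single_gate": 2}


def compare(input_size, hidden_size):
    """Return a dict mapping unit name to its parameter count."""
    return {
        name: recurrent_params(input_size, hidden_size, k)
        for name, k in BLOCKS.items()
    }


if __name__ == "__main__":
    for name, count in compare(input_size=10, hidden_size=20).items():
        print(f"{name:12s} {count}")
```

Under these assumptions the count scales linearly with the number of affine blocks, so removing a gate removes a fixed share of both the weights and the per-step matrix multiplications. Actual savings for SGU and DSGU depend on their exact definitions in the paper.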