LG ML Jul 14, 2020

Shuffling Recurrent Neural Networks

arXiv:2007.07324v18.536 citations Has Code

Originality Synthesis-oriented

AI Analysis

This paper is an incremental improvement aimed at machine learning researchers and practitioners; it addresses vanishing and exploding gradients in RNNs.

The paper tackles vanishing and exploding gradients in recurrent neural networks by proposing a model in which the hidden state is permuted and then updated with a learned function of the input. The authors report performance competitive with leading baselines.

We propose a novel recurrent neural network model, where the hidden state $h_t$ is obtained by permuting the vector elements of the previous hidden state $h_{t-1}$ and adding the output of a learned function $b(x_t)$ of the input $x_t$ at time $t$. In our model, the prediction is given by a second learned function $s$, which is applied to the hidden state, $s(h_t)$. The method is easy to implement, extremely efficient, and suffers from neither vanishing nor exploding gradients. In an extensive set of experiments, the method shows competitive results in comparison to the leading literature baselines.
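The recurrence in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the cyclic-shift permutation, the zero initial state, the plain linear maps standing in for the learned functions $b$ and $s$, and all names (`permute`, `linear`, `run_shuffling_rnn`, `W_b`, `W_s`) are assumptions made for the example. Since $b(x_t)$ does not depend on $h_{t-1}$, the Jacobian of $h_t$ with respect to $h_{t-1}$ is the permutation matrix itself, which preserves vector norms; the last lines check that property numerically.

```python
import math
import random


def permute(h, perm):
    """Return the vector whose i-th entry is h[perm[i]]."""
    return [h[j] for j in perm]


def linear(W, x):
    """Matrix-vector product; a stand-in for a learned function (assumption)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def run_shuffling_rnn(xs, perm, W_b, W_s):
    """Apply h_t = permute(h_{t-1}) + b(x_t) and collect s(h_t) per step."""
    h = [0.0] * len(perm)  # zero initial state (assumption)
    outputs = []
    for x in xs:
        b = linear(W_b, x)  # b(x_t): depends on the input only
        h = [p + bi for p, bi in zip(permute(h, perm), b)]
        outputs.append(linear(W_s, h))  # s(h_t): prediction from the state
    return h, outputs


def norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(vi * vi for vi in v))


rng = random.Random(0)
d_hidden, d_in, d_out = 4, 2, 1
perm = [1, 2, 3, 0]  # fixed cyclic shift (assumption)
W_b = [[rng.uniform(-1, 1) for _ in range(d_in)] for _ in range(d_hidden)]
W_s = [[rng.uniform(-1, 1) for _ in range(d_hidden)] for _ in range(d_out)]
xs = [[rng.uniform(-1, 1) for _ in range(d_in)] for _ in range(5)]

h_final, outputs = run_shuffling_rnn(xs, perm, W_b, W_s)

# A vector pushed through the permutation keeps its norm, so repeated
# application neither shrinks nor amplifies it.
g = [0.5, -1.0, 2.0, 0.25]
norm_preserved = math.isclose(norm(permute(g, perm)), norm(g))
```

In this sketch only `W_b` and `W_s` would be trained; the permutation is fixed, so the state update costs one reordering and one vector addition per step.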

