LG · ML · Jul 14, 2020

Shuffling Recurrent Neural Networks

arXiv:2007.07324v1 · 36 citations
Originality: Synthesis-oriented
AI Analysis

This is an incremental improvement aimed at machine learning researchers and practitioners, addressing vanishing and exploding gradients in RNNs.

The paper addresses vanishing and exploding gradients in recurrent neural networks. It proposes a model in which the previous hidden state is permuted and then summed with a learned function of the current input, and it reports performance competitive with leading baselines.
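A short derivation may help show why this update avoids gradient problems. It is a sketch under stated assumptions: the symbol $P$ for the permutation matrix and the step count $k$ are notation introduced here, not taken from the paper, and $b$ is assumed to depend only on the input $x_t$.

$$
h_t = P\,h_{t-1} + b(x_t)
\quad\Longrightarrow\quad
\frac{\partial h_t}{\partial h_{t-1}} = P,
\qquad
\frac{\partial h_{t+k}}{\partial h_t} = P^{k}.
$$

Since a permutation matrix is orthogonal ($P^\top P = I$), any power $P^k$ is again a permutation matrix, so $\lVert P^{k} v \rVert_2 = \lVert v \rVert_2$ for every vector $v$. Under these assumptions, the gradient propagated back through the hidden state keeps its norm over any number of steps.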

We propose a novel recurrent neural network model, where the hidden state $h_t$ is obtained by permuting the vector elements of the previous hidden state $h_{t-1}$ and adding the output of a learned function $b(x_t)$ of the input $x_t$ at time $t$. In our model, the prediction is given by a second learned function $s$, which is applied to the hidden state: $s(h_t)$. The method is easy to implement, extremely efficient, and suffers from neither vanishing nor exploding gradients. In an extensive set of experiments, the method shows results competitive with the leading baselines in the literature.
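The recurrence described in the abstract can be sketched in a few lines of plain Python. This is a toy illustration, not the authors' implementation: the cyclic shift used as the permutation, the hand-written stand-ins `toy_b` and `toy_s` for the learned functions $b$ and $s$, and all function names are assumptions made for the example.

```python
def permute(h, perm):
    """Return a new vector whose i-th entry is h[perm[i]]."""
    return [h[j] for j in perm]


def shuffling_rnn(xs, perm, b, s, dim):
    """Run h_t = permute(h_{t-1}) + b(x_t) over xs and return (s(h_T), h_T)."""
    h = [0.0] * dim  # assumption: zero initial hidden state
    for x in xs:
        shuffled = permute(h, perm)
        update = b(x)
        h = [p + u for p, u in zip(shuffled, update)]
    return s(h), h


# Hypothetical stand-ins for the learned functions b and s.
def toy_b(x):
    # Writes the scalar input into slot 0 of a 4-dimensional update.
    return [x[0], 0.0, 0.0, 0.0]


def toy_s(h):
    # Reads out the sum of the hidden state.
    return sum(h)


# Assumed permutation: cyclic shift, so new[i] = old[i - 1].
shift = [3, 0, 1, 2]
inputs = [[1.0], [2.0], [3.0]]

y, h_final = shuffling_rnn(inputs, shift, toy_b, toy_s, dim=4)
print(h_final)  # [3.0, 2.0, 1.0, 0.0]
print(y)        # 6.0
```

With these toy choices the hidden state behaves like a shift register: each input is written into slot 0 and earlier inputs move one slot along per step. In the paper both $b$ and $s$ are learned, so the actual dynamics would be richer than this example suggests.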

Code Implementations: 1 repo
