ML LG | Nov 18, 2017

MinimalRNN: Toward More Interpretable and Trainable Recurrent Neural Networks

arXiv:1711.06788v27.633 citations

Originality: Incremental advance

AI Analysis

This work addresses the need for more interpretable and trainable RNNs for researchers and practitioners in machine learning, though it appears incremental as it builds on existing RNN structures.

The authors address the complexity and poor interpretability of gated recurrent neural networks by introducing MinimalRNN, a simplified architecture that performs comparably to gated RNNs. They report improved interpretability and trainability, shown by the disentangled states the model learns and by its ability to capture longer-range dependencies.

We introduce MinimalRNN, a new recurrent neural network architecture that achieves performance comparable to the popular gated RNNs with a simplified structure. It employs minimal updates within the RNN, which leads not only to efficient learning and testing but, more importantly, to better interpretability and trainability. We demonstrate that by adopting a more restrictive update rule, MinimalRNN learns disentangled RNN states. We further examine the learning dynamics of different RNN structures using input-output Jacobians, and show that MinimalRNN is able to capture longer-range dependencies than existing RNN architectures.
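
The abstract does not spell out the "minimal update" itself. The sketch below is only an illustration. It assumes the update has the form, recalled from the paper body rather than stated here, in which the input is first mapped to a latent vector z, a single gate u is computed from the previous state and z, and the new state is an elementwise interpolation between the two. The names `minimal_rnn_step`, `init_params`, and `run_sequence`, the parameter keys, and the tanh input map are assumptions made for this example, not the authors' code.

```python
import math
import random


def sigmoid(a):
    """Numerically stable logistic function."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def minimal_rnn_step(x, h_prev, params):
    """One step of the assumed MinimalRNN-style update.

    z = tanh(W_x x + b_z)                  input mapped into the latent space
    u = sigmoid(U_h h_prev + U_z z + b_u)  the single update gate
    h = u * h_prev + (1 - u) * z           elementwise interpolation
    """
    z = [math.tanh(a + b)
         for a, b in zip(matvec(params["W_x"], x), params["b_z"])]
    pre_gate = [a + b + c
                for a, b, c in zip(matvec(params["U_h"], h_prev),
                                   matvec(params["U_z"], z),
                                   params["b_u"])]
    u = [sigmoid(p) for p in pre_gate]
    h = [ui * hp + (1.0 - ui) * zi for ui, hp, zi in zip(u, h_prev, z)]
    return h, u, z


def init_params(input_dim, hidden_dim, seed=0):
    """Small random weights; the scale 0.1 is an arbitrary choice."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.gauss(0.0, 0.1) for _ in range(cols)]
                for _ in range(rows)]

    return {
        "W_x": mat(hidden_dim, input_dim),
        "b_z": [0.0] * hidden_dim,
        "U_h": mat(hidden_dim, hidden_dim),
        "U_z": mat(hidden_dim, hidden_dim),
        "b_u": [0.0] * hidden_dim,
    }


def run_sequence(xs, params, hidden_dim):
    """Apply the step over a sequence, starting from a zero state."""
    h = [0.0] * hidden_dim
    for x in xs:
        h, _, _ = minimal_rnn_step(x, h, params)
    return h
```

Under this assumed form, each coordinate of the new state lies between the matching coordinates of the previous state and z, and a gate value close to 1 carries the previous state forward almost unchanged. That is one way to read the abstract's claims about trainability and longer-range dependencies, though the paper's own argument rests on input-output Jacobians, which this sketch does not compute.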

