CL, Jul 6, 2018

Sliced Recurrent Neural Networks

arXiv:1807.02291v11094 citations
Originality: Incremental advance
AI Analysis

This paper addresses the training inefficiency of RNNs for NLP practitioners. It offers a large speedup while maintaining or improving accuracy, though it is an incremental advance over existing RNN methods.

The paper tackles the slow training of recurrent neural networks (RNNs), which stems from their sequential structure. It introduces sliced RNNs (SRNNs), which slice each sequence into subsequences that can be processed in parallel. The authors report that SRNNs are 136 times as fast as standard RNNs and outperform them on six large-scale sentiment analysis datasets.

Recurrent neural networks have achieved great success in many NLP tasks. However, they have difficulty in parallelization because of the recurrent structure, so it takes much time to train RNNs. In this paper, we introduce sliced recurrent neural networks (SRNNs), which could be parallelized by slicing the sequences into many subsequences. SRNNs have the ability to obtain high-level information through multiple layers with few extra parameters. We prove that the standard RNN is a special case of the SRNN when we use linear activation functions. Without changing the recurrent units, SRNNs are 136 times as fast as standard RNNs and could be even faster when we train longer sequences. Experiments on six large-scale sentiment analysis datasets show that SRNNs achieve better performance than standard RNNs.
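The slicing idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a plain tanh recurrent unit, a sequence length that divides evenly into slices, and hypothetical names (`rnn_last_state`, `srnn_encode`, `make_layers`, `sequential_steps`) chosen for illustration. The bottom layer runs the recurrent unit over all minimal subsequences at once (the batch axis stands in for parallel execution), and each higher layer runs over groups of `n` states from the layer below.

```python
import numpy as np


def rnn_last_state(x, W_x, W_h, b):
    """Run a plain tanh RNN over x of shape (batch, steps, dim_in).

    Returns the last hidden state, shape (batch, dim_hidden).
    """
    h = np.zeros((x.shape[0], W_h.shape[0]))
    for t in range(x.shape[1]):
        h = np.tanh(x[:, t] @ W_x + h @ W_h + b)
    return h


def make_layers(rng, dim_in, dim_hidden, k):
    """Hypothetical helper: random weights for k + 1 recurrent layers."""
    layers = []
    for level in range(k + 1):
        d = dim_in if level == 0 else dim_hidden
        layers.append((
            0.1 * rng.standard_normal((d, dim_hidden)),
            0.1 * rng.standard_normal((dim_hidden, dim_hidden)),
            np.zeros(dim_hidden),
        ))
    return layers


def srnn_encode(seq, n, layers):
    """Encode seq of shape (T, dim_in) with k = len(layers) - 1 slicing levels.

    Assumes T is divisible by n**k (an assumption of this sketch).
    """
    k = len(layers) - 1
    T, d = seq.shape
    assert T % (n ** k) == 0, "sequence length must be divisible by n**k"
    l0 = T // n ** k
    # Layer 0: n**k independent minimal subsequences, each of length l0.
    h = rnn_last_state(seq.reshape(n ** k, l0, d), *layers[0])
    # Higher layers: every n consecutive states form a new short sequence.
    for W_x, W_h, b in layers[1:]:
        h = rnn_last_state(h.reshape(-1, n, h.shape[1]), W_x, W_h, b)
    return h[0]


def sequential_steps(T, n, k):
    """Recurrent steps on the critical path if all slices run in parallel."""
    return T // n ** k + k * n


rng = np.random.default_rng(0)
layers = make_layers(rng, dim_in=4, dim_hidden=5, k=2)
seq = rng.standard_normal((32, 4))
encoding = srnn_encode(seq, n=2, layers=layers)
print(encoding.shape, sequential_steps(32, 2, 2), "steps instead of", 32)
```

With `k = 0` the sketch reduces to an ordinary RNN over the whole sequence. The step count assumes ideal parallelism across slices, so it indicates where a speedup could come from, not the speedup the paper measured.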

Code Implementations: 3 repos
