NE · Sep 8, 2014

Recurrent Neural Network Regularization

arXiv:1409.2329v5 · 3002 citations
Originality: Incremental advance
AI Analysis

This paper addresses poor regularization in RNNs and LSTMs, which helps machine learning researchers and practitioners get better performance on sequence-based tasks.

The paper tackled the problem of overfitting in Recurrent Neural Networks (RNNs) with LSTM units by introducing a simple regularization technique that correctly applies dropout, resulting in substantial reductions in overfitting across tasks like language modeling, speech recognition, image caption generation, and machine translation.

We present a simple regularization technique for Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM) units. Dropout, the most successful technique for regularizing neural networks, does not work well with RNNs and LSTMs. In this paper, we show how to correctly apply dropout to LSTMs, and show that it substantially reduces overfitting on a variety of tasks. These tasks include language modeling, speech recognition, image caption generation, and machine translation.
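The abstract does not spell out what "correctly" applying dropout means. The sketch below assumes the usual reading of the paper: dropout is applied only to the non-recurrent connections (the input entering each layer and the top layer's output), while the hidden and cell states carried from one time step to the next are left untouched. The helper names (`dropout`, `lstm_step`, `init_weights`, `run_stack`), the weight layout, and the initialization range are hypothetical choices for illustration, not the authors' code.

```python
import math
import random

def dropout(vec, rate, rng, train=True):
    """Inverted dropout: zero each unit with probability `rate`, rescale the rest."""
    if not train or rate == 0.0:
        return list(vec)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in vec]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, W):
    """One LSTM step. W maps a gate name to (weight rows over [x, h_prev], bias)."""
    z = x + h_prev  # list concatenation of input and recurrent state

    def affine(name):
        rows, bias = W[name]
        return [sum(w * v for w, v in zip(row, z)) + b for row, b in zip(rows, bias)]

    i = [sigmoid(a) for a in affine("i")]    # input gate
    f = [sigmoid(a) for a in affine("f")]    # forget gate
    o = [sigmoid(a) for a in affine("o")]    # output gate
    g = [math.tanh(a) for a in affine("g")]  # candidate cell update
    c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c_prev, i, g)]
    h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h, c

def init_weights(n_in, n_hid, rng):
    """Hypothetical initializer: small uniform weights, zero biases."""
    def mat():
        return [[rng.uniform(-0.1, 0.1) for _ in range(n_in + n_hid)]
                for _ in range(n_hid)]
    return {name: (mat(), [0.0] * n_hid) for name in "ifog"}

def run_stack(xs, layers, n_hid, rate, rng, train=True):
    """Run a stacked LSTM over a sequence, dropping out only layer-to-layer inputs."""
    states = [([0.0] * n_hid, [0.0] * n_hid) for _ in layers]
    outputs = []
    for x in xs:
        inp = x
        for l, W in enumerate(layers):
            h_prev, c_prev = states[l]
            # Dropout on the non-recurrent (vertical) input only;
            # h_prev and c_prev pass through time without any mask.
            h, c = lstm_step(dropout(inp, rate, rng, train), h_prev, c_prev, W)
            states[l] = (h, c)
            inp = h
        # Dropout on the top layer's output, before any output layer.
        outputs.append(dropout(inp, rate, rng, train))
    return outputs
```

A possible usage: build two layers with `init_weights(3, 4, rng)` and `init_weights(4, 4, rng)`, then call `run_stack(xs, layers, 4, 0.5, rng)` during training and pass `train=False` at evaluation time. Because the recurrent path is never masked, information stored in the cell state is not repeatedly corrupted across time steps, which is the motivation usually given for this placement.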

Code Implementations: 21 repos

Data from Papers with Code (CC-BY-SA-4.0)

