NE, LG · Feb 14, 2014

A Clockwork RNN

arXiv:1402.3511v1 · 539 citations
Originality: Incremental advance
AI Analysis

The paper addresses the difficulty of training RNNs when long-term memory is required, a problem relevant to researchers and practitioners in sequence prediction, though it is an incremental improvement over existing architectures.

The paper tackles the challenge of training RNNs on long-term dependencies by introducing the Clockwork RNN (CW-RNN), which partitions the hidden layer into modules that each operate at their own temporal granularity. Compared with a standard RNN, this reduces the number of parameters, improves performance on the tasks tested, and speeds up network evaluation. In preliminary experiments on audio signal generation and TIMIT spoken word classification, it outperforms both RNN and LSTM networks.

Sequence prediction and classification are ubiquitous and challenging problems in machine learning that can require identifying complex dependencies between temporally distant inputs. Recurrent Neural Networks (RNNs) have the ability, in theory, to cope with these temporal dependencies by virtue of the short-term memory implemented by their recurrent (feedback) connections. However, in practice they are difficult to train successfully when the long-term memory is required. This paper introduces a simple, yet powerful modification to the standard RNN architecture, the Clockwork RNN (CW-RNN), in which the hidden layer is partitioned into separate modules, each processing inputs at its own temporal granularity, making computations only at its prescribed clock rate. Rather than making the standard RNN models more complex, CW-RNN reduces the number of RNN parameters, improves the performance significantly in the tasks tested, and speeds up the network evaluation. The network is demonstrated in preliminary experiments involving two tasks: audio signal generation and TIMIT spoken word classification, where it outperforms both RNN and LSTM networks.
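The module-and-clock idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several details are assumptions the abstract does not state: the clock periods are taken to be powers of two (1, 2, 4, ...), each module is assumed to receive recurrent input only from itself and from modules with longer periods, and tanh is used as the activation. The names `make_cw_rnn`, `cw_rnn_step`, and `run_cw_rnn` are hypothetical.

```python
import numpy as np


def make_cw_rnn(n_in, n_modules, module_size, seed=0):
    """Build parameters for a toy Clockwork-RNN-style hidden layer."""
    rng = np.random.default_rng(seed)
    n_hid = n_modules * module_size
    # Assumption: exponentially spaced clock periods 1, 2, 4, ...
    periods = [2 ** i for i in range(n_modules)]
    w_in = rng.normal(0.0, 0.1, (n_hid, n_in))
    w_h = rng.normal(0.0, 0.1, (n_hid, n_hid))
    # Assumption: module i only reads from modules j >= i (equal or slower
    # clocks), which zeroes the lower-left blocks of the recurrent matrix
    # and so removes those parameters.
    mask = np.zeros((n_hid, n_hid))
    for i in range(n_modules):
        mask[i * module_size:(i + 1) * module_size, i * module_size:] = 1.0
    return {
        "w_in": w_in,
        "w_h": w_h * mask,
        "b": np.zeros(n_hid),
        "periods": periods,
        "module_size": module_size,
    }


def cw_rnn_step(params, h_prev, x, t):
    """Advance one time step; only modules whose clock ticks at t update."""
    k = params["module_size"]
    h = h_prev.copy()  # inactive modules keep their previous state
    for i, period in enumerate(params["periods"]):
        if t % period == 0:
            rows = slice(i * k, (i + 1) * k)
            pre = (params["w_h"][rows] @ h_prev
                   + params["w_in"][rows] @ x
                   + params["b"][rows])
            h[rows] = np.tanh(pre)
    return h


def run_cw_rnn(params, xs):
    """Run the layer over a sequence xs of shape (T, n_in)."""
    h = np.zeros(params["w_h"].shape[0])
    states = []
    for t, x in enumerate(xs):
        h = cw_rnn_step(params, h, x, t)
        states.append(h)
    return np.stack(states)


# Usage: 3 modules of 2 units each, driven by a random 8-step sequence.
params = make_cw_rnn(n_in=4, n_modules=3, module_size=2)
xs = np.random.default_rng(1).normal(size=(8, 4))
states = run_cw_rnn(params, xs)
```

In this sketch the slower modules are skipped on most time steps, which is where the faster evaluation comes from, and the masked recurrent matrix holds fewer nonzero weights than a fully connected one of the same size.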

Code Implementations: 5 repos