NE LG ML | Apr 29, 2019

Learning Longer-term Dependencies via Grouped Distributor Unit

arXiv:1906.08856v1 | 4.01 citations

Originality: Incremental advance

AI Analysis

This paper addresses a key challenge in sequence modeling, offering machine learning practitioners a simpler and more effective RNN variant, though the advance appears incremental because it builds on existing gated RNN architectures.

The paper tackles the problem of learning long-term dependencies in recurrent neural networks. It proposes a gated RNN structure, the grouped distributor unit (GDU), which uses a single gate and partitions hidden states into groups so that memory updates are distributed adaptively among the members of each group. In experiments, it outperforms LSTM and GRU on tasks that include both pathological problems and natural data sets.

Learning long-term dependencies remains difficult for recurrent neural networks (RNNs) despite their recent success in sequence modeling. In this paper, we propose a novel gated RNN structure that contains only one gate. Hidden states in the proposed grouped distributor unit (GDU) are partitioned into groups. For each group, the proportion of memory to be overwritten in each state transition is limited to a constant and is adaptively distributed to each group member. In other words, every group has a fixed overall update rate, yet its units are allowed to update at different paces. Information is therefore forced to be latched in a flexible way, which helps the model capture long-term dependencies in data. Besides having a simpler structure, GDU is demonstrated experimentally to outperform LSTM and GRU on tasks including both pathological problems and natural data sets.
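
The grouping idea in the abstract can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: the single gate is realised as a softmax within each group (so each group's update proportions sum to exactly 1), the candidate state uses tanh, and bias terms are omitted. All names (`grouped_softmax`, `gdu_step`, `W_u`, `U_u`, `W_c`, `U_c`) are hypothetical and not taken from the paper.

```python
import math
import random


def grouped_softmax(scores, group_size):
    """Softmax over each consecutive group, so every group's gates sum to 1."""
    gates = []
    for start in range(0, len(scores), group_size):
        group = scores[start:start + group_size]
        peak = max(group)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in group]
        total = sum(exps)
        gates.extend(e / total for e in exps)
    return gates


def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def gdu_step(x, h_prev, params, group_size):
    """One state transition of the sketched unit (assumed form, not the paper's)."""
    W_u, U_u, W_c, U_c = params
    # Single gate: raw scores from the input and the previous hidden state.
    scores = [a + b for a, b in zip(matvec(W_u, x), matvec(U_u, h_prev))]
    u = grouped_softmax(scores, group_size)
    # Candidate state (assumed tanh nonlinearity).
    cand = [math.tanh(a + b)
            for a, b in zip(matvec(W_c, x), matvec(U_c, h_prev))]
    # Each unit overwrites the fraction u_i of its memory with the candidate.
    h = [(1.0 - ui) * hi + ui * ci for ui, hi, ci in zip(u, h_prev, cand)]
    return h, u


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
input_size, hidden_size, group_size = 3, 6, 3  # two groups of three units
params = (
    random_matrix(hidden_size, input_size, rng),   # W_u
    random_matrix(hidden_size, hidden_size, rng),  # U_u
    random_matrix(hidden_size, input_size, rng),   # W_c
    random_matrix(hidden_size, hidden_size, rng),  # U_c
)
h, u = gdu_step([1.0, -0.5, 0.25], [0.0] * hidden_size, params, group_size)
group_sums = [sum(u[i:i + group_size])
              for i in range(0, hidden_size, group_size)]
```

Under these assumptions, every entry of `group_sums` equals 1 up to rounding, while the individual gate values inside a group differ. This mirrors the abstract's description of a fixed overall update rate per group with different paces for its members.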
