Recurrent Attention Unit
This work addresses a specific bottleneck in RNN models for sequence learning tasks like image and text processing, offering an incremental improvement by enhancing GRU with attention mechanisms.
GRU cannot adaptively focus on important regions in sequence data, which can lead to information redundancy or loss. The authors address this by proposing the Recurrent Attention Unit (RAU), which integrates an attention gate into GRU. In experiments on image classification, sentiment classification, and language modeling, RAU consistently improves on GRU and other baselines.
Recurrent Neural Networks (RNNs) have been successfully applied to many sequence learning problems, such as handwriting recognition, image description, natural language processing, and video motion analysis. After years of development, researchers have improved the internal structure of the RNN and introduced many variants. Among these, the Gated Recurrent Unit (GRU) is one of the most widely used RNN models. However, GRU lacks the capability to adaptively attend to certain regions or locations, which may cause information redundancy or loss during learning. In this paper, we propose an RNN model, called the Recurrent Attention Unit (RAU), which seamlessly integrates the attention mechanism into the interior of GRU by adding an attention gate. The attention gate enhances GRU's ability to retain long-term memory and helps memory cells quickly discard unimportant content. RAU is capable of extracting information from sequential data by adaptively selecting a sequence of regions or locations and paying more attention to the selected regions during learning. Extensive experiments on image classification, sentiment classification, and language modeling show that RAU consistently outperforms GRU and other baseline methods.
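
The abstract describes the attention gate only in words, so the following is a minimal NumPy sketch of one way a GRU step could be extended with such a gate. It is not the paper's formulation. The update gate, reset gate, and candidate state follow the standard GRU equations; the attention gate (a softmax over hidden units that reweights the candidate state), the parameter names, and the helper functions `init_params`, `attention_gru_step`, and `run_sequence` are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def softmax(x):
    # Subtract the max for numerical stability.
    e = np.exp(x - np.max(x))
    return e / e.sum()


def init_params(input_size, hidden_size, seed=0):
    """Random weights for the z, r, h (standard GRU) and a (assumed attention) gates."""
    rng = np.random.default_rng(seed)
    params = {}
    for gate in ("z", "r", "h", "a"):
        params["W" + gate] = rng.normal(0.0, 0.1, (hidden_size, input_size))
        params["U" + gate] = rng.normal(0.0, 0.1, (hidden_size, hidden_size))
        params["b" + gate] = np.zeros(hidden_size)
    return params


def attention_gru_step(x, h_prev, p):
    """One step of a GRU cell with a hypothetical attention gate."""
    # Standard GRU gates.
    z = sigmoid(p["Wz"] @ x + p["Uz"] @ h_prev + p["bz"])  # update gate
    r = sigmoid(p["Wr"] @ x + p["Ur"] @ h_prev + p["br"])  # reset gate
    h_cand = np.tanh(p["Wh"] @ x + p["Uh"] @ (r * h_prev) + p["bh"])

    # ASSUMPTION: the attention gate is a softmax over hidden units that
    # reweights the candidate state. The paper's actual gate may differ.
    a = softmax(p["Wa"] @ x + p["Ua"] @ h_prev + p["ba"])

    # Blend the previous state with the attention-weighted candidate.
    h_new = (1.0 - z) * h_prev + z * (a * h_cand)
    return h_new, a


def run_sequence(xs, p):
    """Run the cell over a sequence of input vectors, starting from a zero state."""
    h = np.zeros(p["bz"].shape[0])
    for x in xs:
        h, _ = attention_gru_step(x, h, p)
    return h
```

For example, `run_sequence(np.ones((5, 3)), init_params(3, 4))` returns a hidden state of shape `(4,)`. In this sketch the attention weights sum to one at every step, so they act as a relative importance score across hidden units rather than as independent per-unit gates.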