CV, LG · Jan 25, 2016

Survey on the attention based RNN model and its applications in computer vision

arXiv:1601.06823v1 · 130 citations
Originality: Synthesis-oriented
AI Analysis

The survey provides a comprehensive overview for computer vision researchers, but its contribution is incremental: it synthesizes existing work without introducing new methods.

This survey reviews attention-based RNN models that address the challenge of exploring implicit relations between input and output sequences in sequence-to-sequence problems, highlighting their superiority in computer vision applications through experimental results.

Recurrent neural networks (RNNs) can be used to solve sequence-to-sequence problems, where both the input and the output have sequential structure. There are usually implicit relations between the two structures, but it is hard for a common RNN model to fully explore them. In this survey, we introduce attention-based RNN models, which can focus on different parts of the input for each output item in order to explore and exploit the implicit relations between input and output items. The different attention mechanisms are described in detail. We then introduce applications in computer vision that use attention-based RNN models. Experimental results show the superiority of the attention-based RNN model. Finally, some future research directions are given.
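To make "focus on different parts of the input for each output item" concrete, here is a minimal sketch of one soft-attention step in plain Python. It is an illustration under stated assumptions, not the survey's own formulation: dot-product scoring is just one common choice among the mechanisms such surveys cover, and the names `soft_attention`, `query`, and `input_states` are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention(query, input_states):
    """One attention step for a single output item.

    query        -- current decoder state (assumed name), a list of floats
    input_states -- one encoded vector per input item, same length as query
    Returns (weights, context): one weight per input item, and the
    weighted average of the input vectors.
    """
    # Assumption: relevance is scored with a plain dot product.
    scores = [sum(q * h for q, h in zip(query, state)) for state in input_states]
    weights = softmax(scores)
    dim = len(input_states[0])
    context = [
        sum(w * state[d] for w, state in zip(weights, input_states))
        for d in range(dim)
    ]
    return weights, context


# Toy example: three input items, each encoded as a 2-dimensional vector.
input_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = soft_attention([2.0, 0.0], input_states)
```

With this toy query, the first and third input items receive equal weight and the second receives the least, so the context vector is dominated by the inputs that align with the query. In an attention-based RNN, a new set of weights would be computed for every output item as the decoder state changes.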

