CV
Dec 12, 2015

RNN Fisher Vectors for Action Recognition and Image Annotation

arXiv:1512.03958v1
171 citations
Originality: Incremental advance
AI Analysis

This work addresses two sequence-based computer vision tasks, video action recognition and image annotation, and is rated an incremental methodological advance.

The authors tackled the problem of encoding sequences for action recognition and image annotation by using RNNs as generative models within a Fisher Vector framework, achieving state-of-the-art results and demonstrating transfer learning between these tasks.

Recurrent Neural Networks (RNNs) have had considerable success in classifying and predicting sequences. We demonstrate that RNNs can be effectively used in order to encode sequences and provide effective representations. The methodology we use is based on Fisher Vectors, where the RNNs are the generative probabilistic models and the partial derivatives are computed using backpropagation. State of the art results are obtained in two central but distant tasks, which both rely on sequences: video action recognition and image annotation. We also show a surprising transfer learning result from the task of image annotation to the task of video action recognition.
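The idea in the abstract, taking the gradient of a sequence's log-likelihood under a trained RNN and using it as the sequence's representation, can be illustrated with a minimal sketch. This is not the authors' implementation: the tiny next-token tanh RNN, the parameter names (`Wxh`, `Whh`, `Why`), the helper names, and the final L2 normalization are all assumptions made for illustration, and the weights here are random rather than trained.

```python
import numpy as np

def init_params(vocab, hidden, seed=0):
    # Hypothetical toy model: random weights stand in for a trained RNN.
    rng = np.random.default_rng(seed)
    return {
        "Wxh": rng.normal(0, 0.1, (hidden, vocab)),
        "Whh": rng.normal(0, 0.1, (hidden, hidden)),
        "Why": rng.normal(0, 0.1, (vocab, hidden)),
    }

def log_likelihood(params, seq):
    # Sum over t of log p(seq[t+1] | seq[0..t]) under the RNN.
    h = np.zeros(params["Whh"].shape[0])
    total = 0.0
    for x, y in zip(seq[:-1], seq[1:]):
        h = np.tanh(params["Wxh"][:, x] + params["Whh"] @ h)
        logits = params["Why"] @ h
        logits = logits - logits.max()
        total += logits[y] - np.log(np.exp(logits).sum())
    return total

def loglik_gradient(params, seq):
    # Backpropagation through time for the log-likelihood above.
    Wxh, Whh, Why = params["Wxh"], params["Whh"], params["Why"]
    hs, probs = [np.zeros(Whh.shape[0])], []
    for x in seq[:-1]:  # forward pass, caching hidden states
        h = np.tanh(Wxh[:, x] + Whh @ hs[-1])
        logits = Why @ h
        p = np.exp(logits - logits.max())
        probs.append(p / p.sum())
        hs.append(h)
    grads = {k: np.zeros_like(v) for k, v in params.items()}
    dh_next = np.zeros(Whh.shape[0])
    for t in reversed(range(len(seq) - 1)):
        dlogits = -probs[t]            # d log p[y] / d logits = onehot(y) - p
        dlogits[seq[t + 1]] += 1.0
        grads["Why"] += np.outer(dlogits, hs[t + 1])
        dh = Why.T @ dlogits + dh_next
        dpre = (1.0 - hs[t + 1] ** 2) * dh   # through tanh
        grads["Wxh"][:, seq[t]] += dpre
        grads["Whh"] += np.outer(dpre, hs[t])
        dh_next = Whh.T @ dpre
    return grads

def rnn_fisher_vector(params, seq):
    # Concatenate the parameter gradients into one fixed-length vector.
    # L2 normalization is an assumed post-processing step.
    grads = loglik_gradient(params, seq)
    g = np.concatenate([grads[k].ravel() for k in ("Wxh", "Whh", "Why")])
    return g / (np.linalg.norm(g) + 1e-12)

params = init_params(vocab=5, hidden=4)
fv = rnn_fisher_vector(params, [0, 3, 1, 4, 2, 2, 0])
print(fv.shape)
```

The resulting vector has one entry per model parameter regardless of sequence length, which is what lets sequences of different lengths be fed to a standard classifier.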

