CV · Dec 12, 2015

RNN Fisher Vectors for Action Recognition and Image Annotation

Guy Lev, Gil Sadeh, Benjamin Klein, Lior Wolf

arXiv:1512.03958v120.2171 citations

Originality: Incremental advance

AI Analysis

This work addresses two sequence-based computer vision tasks, video action recognition and image annotation, and offers an incremental methodological advance.

The authors tackled the problem of encoding sequences for action recognition and image annotation by using RNNs as generative models within a Fisher Vector framework, achieving state-of-the-art results and demonstrating transfer learning between these tasks.

Recurrent Neural Networks (RNNs) have had considerable success in classifying and predicting sequences. We demonstrate that RNNs can be effectively used in order to encode sequences and provide effective representations. The methodology we use is based on Fisher Vectors, where the RNNs are the generative probabilistic models and the partial derivatives are computed using backpropagation. State of the art results are obtained in two central but distant tasks, which both rely on sequences: video action recognition and image annotation. We also show a surprising transfer learning result from the task of image annotation to the task of video action recognition.
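The core idea in the abstract, treating an RNN as a generative model and using the gradient of a sequence's log-likelihood as its representation, can be illustrated with a small sketch. This is not the authors' implementation. It assumes a toy Elman RNN over discrete tokens with a softmax next-token output, and all names (`init_params`, `log_likelihood_and_grad`, `rnn_fisher_vector`) are hypothetical. It also omits the Fisher information normalization and uses plain L2 normalization instead, which is a simplifying assumption.

```python
import numpy as np

def init_params(vocab_size, hidden_size, seed=0):
    # Small random weights for a toy Elman RNN (hypothetical setup).
    rng = np.random.default_rng(seed)
    s = 0.1
    return {
        "Wxh": s * rng.standard_normal((hidden_size, vocab_size)),
        "Whh": s * rng.standard_normal((hidden_size, hidden_size)),
        "Why": s * rng.standard_normal((vocab_size, hidden_size)),
    }

def log_likelihood_and_grad(params, seq):
    """Return log p(seq[1:] | seq[0]) and its gradient w.r.t. each weight matrix."""
    Wxh, Whh, Why = params["Wxh"], params["Whh"], params["Why"]
    hidden_size = Whh.shape[0]
    hs = [np.zeros(hidden_size)]  # hs[t] is the state before reading seq[t]
    probs = []
    loglik = 0.0

    # Forward pass: predict seq[t + 1] from the state after reading seq[t].
    for t in range(len(seq) - 1):
        h = np.tanh(Wxh[:, seq[t]] + Whh @ hs[-1])
        logits = Why @ h
        logits -= logits.max()  # numerical stability
        p = np.exp(logits) / np.exp(logits).sum()
        loglik += np.log(p[seq[t + 1]])
        hs.append(h)
        probs.append(p)

    # Backward pass: backpropagation through time on the log-likelihood.
    grads = {k: np.zeros_like(v) for k, v in params.items()}
    dh_next = np.zeros(hidden_size)
    for t in reversed(range(len(seq) - 1)):
        dlogits = -probs[t]
        dlogits[seq[t + 1]] += 1.0  # d loglik / d logits = onehot - p
        h = hs[t + 1]
        grads["Why"] += np.outer(dlogits, h)
        dh = Why.T @ dlogits + dh_next
        dpre = (1.0 - h ** 2) * dh  # derivative through tanh
        grads["Wxh"][:, seq[t]] += dpre
        grads["Whh"] += np.outer(dpre, hs[t])
        dh_next = Whh.T @ dpre
    return loglik, grads

def rnn_fisher_vector(params, seq):
    # Flatten the gradient into one vector and L2-normalize it
    # (assumption: no Fisher information whitening in this sketch).
    _, grads = log_likelihood_and_grad(params, seq)
    v = np.concatenate([grads[k].ravel() for k in sorted(grads)])
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

params = init_params(vocab_size=5, hidden_size=4)
fv = rnn_fisher_vector(params, [0, 3, 1, 4, 2])
print(fv.shape)
```

In this sketch the representation has one entry per model parameter, so its length is fixed regardless of sequence length; a standard classifier could then be trained on such vectors.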
