CV · Sep 28, 2015

Hyper-Fisher Vectors for Action Recognition

arXiv:1509.08439v1 · 1 citation
Originality: Incremental advance
AI Analysis

This work addresses action recognition for video analysis, presenting an incremental improvement in encoding methods.

The paper tackles action recognition in videos by proposing a novel encoding scheme called Hyper-Fisher Vectors, which combines Fisher vector and bag-of-words encodings, resulting in a 2-3% performance improvement over improved Fisher Vector encoding on datasets like YouTube and HMDB51.

In this paper, a novel encoding scheme combining Fisher vector and bag-of-words encodings is proposed for recognizing actions in videos. The proposed Hyper-Fisher Vector encoding is the sum of local Fisher vectors, which are computed based on the traditional Bag-of-Words (BoW) encoding. The proposed encoding is thus simple, yet a more effective representation than the traditional Fisher Vector encoding. Through extensive evaluation on challenging action recognition datasets, viz., YouTube, Olympic Sports, UCF50 and HMDB51, we show that the proposed Hyper-Fisher Vector encoding improves recognition performance by around 2-3% compared to the improved Fisher Vector encoding. We also perform experiments showing that the performance of the Hyper-Fisher Vector is robust to the dictionary size of the BoW encoding.
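
The abstract describes the encoding only at a high level, so the following is a minimal sketch of one plausible reading, not the authors' implementation. It assumes that descriptors are hard-assigned to BoW codewords, that one local Fisher vector is computed per codeword group using a shared diagonal-covariance GMM, and that each local vector is power- and L2-normalized before summation. For brevity the Fisher vector here uses only the gradients with respect to the GMM means. All function names, parameters, and the random data are hypothetical.

```python
import numpy as np

def fisher_vector(X, weights, means, sigmas):
    """Simplified Fisher vector (assumption: mean gradients only, diagonal GMM)."""
    # Whitened differences between descriptors and Gaussian means, shape (N, K, D).
    diff = (X[:, None, :] - means[None, :, :]) / sigmas[None, :, :]
    # Per-Gaussian log-density up to a constant that cancels in the posteriors.
    log_p = (np.log(weights)[None, :]
             - 0.5 * np.sum(diff ** 2, axis=2)
             - np.sum(np.log(sigmas), axis=1)[None, :])
    log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
    gamma = np.exp(log_p)
    gamma /= gamma.sum(axis=1, keepdims=True)  # soft assignments, shape (N, K)
    # Gradient w.r.t. each mean, averaged over the N descriptors.
    grad = np.einsum("nk,nkd->kd", gamma, diff)
    grad /= X.shape[0] * np.sqrt(weights)[:, None]
    return grad.ravel()

def normalize(v):
    """Signed square-root (power) normalization followed by L2 normalization."""
    v = np.sign(v) * np.sqrt(np.abs(v))
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

def hyper_fisher_vector(X, codebook, weights, means, sigmas):
    """Sum of local Fisher vectors, one per BoW codeword (assumed reading)."""
    # Hard-assign every descriptor to its nearest BoW codeword.
    d2 = ((X[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    assign = d2.argmin(axis=1)
    hfv = np.zeros(means.size)
    for c in range(codebook.shape[0]):
        group = X[assign == c]
        if len(group) == 0:
            continue  # empty codeword contributes nothing
        hfv += normalize(fisher_vector(group, weights, means, sigmas))
    return normalize(hfv)

# Toy usage with random data standing in for real video descriptors.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))                         # 200 descriptors, 8-D
codebook = X[rng.choice(200, size=4, replace=False)]  # 4 BoW codewords
weights = np.full(3, 1.0 / 3.0)                       # 3-component GMM
means = rng.normal(size=(3, 8))
sigmas = np.ones((3, 8))
hfv = hyper_fisher_vector(X, codebook, weights, means, sigmas)
print(hfv.shape)
```

Under these assumptions the final vector has the same dimensionality as a single Fisher vector (number of Gaussians times descriptor dimension), regardless of the BoW dictionary size. Without the per-group normalization, summing the local vectors would differ from the plain Fisher vector only by per-group scaling, so the normalization step is what makes the two encodings meaningfully different in this sketch.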

