CV · Aug 30, 2017

Two-stream Flow-guided Convolutional Attention Networks for Action Recognition

arXiv:1708.09268v1 · 62 citations
Originality: Incremental advance
AI Analysis

The paper addresses action recognition for video analysis, but the advance is incremental: it extends existing two-stream networks with cross-link layers.

The paper tackles action recognition in videos by using optical flow to guide attention to the human foreground, and reports promising results on the UCF101, HMDB51, and Hollywood2 datasets.

This paper proposes two-stream flow-guided convolutional attention networks for action recognition in videos. The central idea is that optical flows, when properly compensated for camera motion, can be used to guide attention to the human foreground. We thus develop cross-link layers from the temporal network (trained on flows) to the spatial network (trained on RGB frames). These cross-link layers guide the spatial stream to pay more attention to the human foreground areas and be less affected by background clutter. We obtain promising performance with our approach on the UCF101, HMDB51 and Hollywood2 datasets.
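To make the cross-link idea concrete, here is a minimal sketch of how temporal-stream features might gate spatial-stream features. The abstract does not specify the form of the cross-link layers, so everything here is an assumption for illustration: the function name `cross_link`, the choice of a sigmoid over channel-averaged temporal activations as the attention map, and the toy feature values. It is not the authors' implementation.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def cross_link(spatial, temporal):
    """Gate spatial features with an attention map derived from temporal features.

    Hypothetical helper, not taken from the paper. Both inputs are feature maps
    given as nested lists shaped [channels][height][width]; height and width
    must match, channel counts may differ.
    """
    n_temporal = len(temporal)
    height, width = len(temporal[0]), len(temporal[0][0])

    # Assumed attention map: sigmoid of the channel-averaged temporal
    # activation at each spatial location, giving values in (0, 1).
    attention = [
        [
            sigmoid(sum(temporal[c][y][x] for c in range(n_temporal)) / n_temporal)
            for x in range(width)
        ]
        for y in range(height)
    ]

    # Element-wise gating: every spatial channel is scaled by the same map,
    # so locations with weak motion response are suppressed.
    gated = [
        [[channel[y][x] * attention[y][x] for x in range(width)] for y in range(height)]
        for channel in spatial
    ]
    return gated, attention


# Toy example: one channel, 2x2 maps. The temporal map responds strongly at
# the top-left location and negatively at the top-right.
spatial = [[[1.0, 1.0], [1.0, 1.0]]]
temporal = [[[4.0, -4.0], [0.0, 0.0]]]
gated, attention = cross_link(spatial, temporal)
```

In this toy case the top-left spatial activation is kept almost unchanged, the top-right is pushed close to zero, and the bottom row is halved, which is the kind of foreground emphasis the abstract describes. A learned cross-link would replace the fixed channel average with trainable weights.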

Code Implementations: 1 repo
