CV | May 28, 2019

Hallucinating Optical Flow Features for Video Classification

arXiv:1905.11799v2 | 22 citations | Has Code
Originality: Incremental advance
AI Analysis

This work targets the efficiency bottleneck of two-stream video classification models, which matters to researchers and practitioners working with large-scale video data. It offers an incremental improvement: expensive optical flow computation is replaced with learned predictions.

The paper tackles the high computational cost of extracting optical flow features for video classification by proposing MoNet, a motion hallucination network that predicts optical flow features from appearance features, reducing computational and storage burdens by almost half while improving classification performance.

Appearance and motion are two key components for depicting and characterizing video content. Two-stream models currently achieve state-of-the-art performance on video classification. However, extracting motion information, specifically in the form of optical flow features, is extremely computationally expensive, especially for large-scale video classification. In this paper, we propose a motion hallucination network, namely MoNet, to imagine the optical flow features from the appearance features, with no reliance on optical flow computation. Specifically, MoNet models the temporal relationships of the appearance features and exploits the contextual relationships of the optical flow features with concurrent connections. Extensive experimental results demonstrate that the proposed MoNet can effectively and efficiently hallucinate the optical flow features, which together with the appearance features consistently improve video classification performance. Moreover, MoNet can help cut almost half of the computational and data-storage burden of two-stream video classification. Our code is available at: https://github.com/YongyiTang92/MoNet-Features.
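To make the idea in the abstract concrete, here is a minimal pure-Python sketch of hallucinating flow features from appearance features and fusing the two streams. It is an illustration under stated assumptions, not the MoNet architecture: `ToyHallucinator`, the plain tanh recurrence, the mean-squared-error objective, and the mean-pool-and-concatenate fusion are all hypothetical stand-ins chosen for brevity, and the weights are random rather than trained.

```python
import math
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def init_matrix(rows, cols, rng, scale=0.1):
    """Small random weights; a real model would learn these."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class ToyHallucinator:
    """Hypothetical stand-in for a motion hallucination network.

    A plain tanh recurrence carries temporal context across frames.
    The actual MoNet design (including its concurrent connections)
    is NOT reproduced here.
    """

    def __init__(self, d_app, d_hidden, d_flow, seed=0):
        rng = random.Random(seed)
        self.W_in = init_matrix(d_hidden, d_app, rng)
        self.W_rec = init_matrix(d_hidden, d_hidden, rng)
        self.W_out = init_matrix(d_flow, d_hidden, rng)
        self.d_hidden = d_hidden

    def __call__(self, appearance_seq):
        h = [0.0] * self.d_hidden
        flow_seq = []
        for x in appearance_seq:
            pre = [a + b for a, b in zip(matvec(self.W_in, x), matvec(self.W_rec, h))]
            h = [math.tanh(p) for p in pre]
            # One predicted flow-feature vector per frame; no optical flow is computed.
            flow_seq.append(matvec(self.W_out, h))
        return flow_seq


def mse(pred_seq, target_seq):
    """Assumed training signal: regress predictions onto precomputed flow features."""
    errs = [(p - t) ** 2 for ps, ts in zip(pred_seq, target_seq) for p, t in zip(ps, ts)]
    return sum(errs) / len(errs)


def mean_pool(seq):
    """Average a sequence of feature vectors over time."""
    n = len(seq)
    return [sum(col) / n for col in zip(*seq)]


def two_stream_descriptor(appearance_seq, hallucinator):
    """Concatenate pooled appearance features with pooled hallucinated flow features."""
    return mean_pool(appearance_seq) + mean_pool(hallucinator(appearance_seq))


# Usage: 5 frames of 8-dimensional appearance features (random placeholders).
rng = random.Random(1)
frames = [[rng.random() for _ in range(8)] for _ in range(5)]
model = ToyHallucinator(d_app=8, d_hidden=6, d_flow=4)
descriptor = two_stream_descriptor(frames, model)
print(len(descriptor))  # 12 = 8 appearance dims + 4 hallucinated flow dims
```

The point of the sketch is the data flow: at inference time only appearance features are needed, and the flow-feature stream is predicted from them instead of being extracted from optical flow. In this sketch, real flow features would be required only as regression targets during training.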

Code Implementations: 1 repo
