CV · Mar 29, 2021

Busy-Quiet Video Disentangling for Video Classification

arXiv:2103.15584v4 · 10 citations
Originality: Incremental advance
AI Analysis

The paper addresses low processing efficiency in video classification, a concern for both researchers and practitioners. The advance is incremental because it builds on existing two-pathway architectures.

The paper tackles the inefficiency of video models by proposing a method to separate busy motion details from redundant quiet information, resulting in a Busy-Quiet Net that outperforms recent models on datasets like Something-Something V1 and Kinetics400.

In video data, busy motion details from moving regions are conveyed within a specific frequency bandwidth in the frequency domain. Meanwhile, the remaining frequencies encode quiet information with substantial redundancy, which causes low processing efficiency in existing video models that take raw RGB frames as input. In this paper, we consider allocating more intensive computation to the important busy information and less computation to the quiet information. We design a trainable Motion Band-Pass Module (MBPM) for separating busy information from quiet information in raw video data. By embedding the MBPM into a two-pathway CNN architecture, we define a Busy-Quiet Net (BQN). The efficiency of BQN comes from avoiding redundancy in the feature space processed by the two pathways: one operates on low-resolution Quiet features, while the other processes Busy features. The proposed BQN outperforms many recent video processing models on the Something-Something V1, Kinetics400, UCF101 and HMDB51 datasets.
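The busy/quiet split described in the abstract can be illustrated with a fixed band-pass filter. The sketch below is not the paper's MBPM, which is trainable and operates on raw video. It assumes a one-dimensional signal (for example, the intensity of one pixel over time) and uses a difference of two Gaussian low-pass filters as a stand-in band-pass. The function names `gaussian_kernel`, `smooth`, `split_busy_quiet` and `downsample`, and the parameters `sigma_narrow`, `sigma_wide`, `radius` and `stride`, are hypothetical choices made for this illustration.

```python
import math


def gaussian_kernel(sigma, radius):
    """Return a normalized 1D Gaussian kernel of length 2 * radius + 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(signal, kernel):
    """Low-pass filter a 1D signal, replicating edge values as padding."""
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for t in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = min(max(t + k - radius, 0), n - 1)  # clamp to valid range
            acc += w * signal[idx]
        out.append(acc)
    return out


def split_busy_quiet(signal, sigma_narrow=1.0, sigma_wide=3.0, radius=6):
    """Split a signal into a band-pass part (busy) and the remainder (quiet).

    Assumption: a difference of Gaussians stands in for a learned band-pass.
    """
    narrow = smooth(signal, gaussian_kernel(sigma_narrow, radius))
    wide = smooth(signal, gaussian_kernel(sigma_wide, radius))
    busy = [a - b for a, b in zip(narrow, wide)]    # mid-frequency band
    quiet = [x - b for x, b in zip(signal, busy)]   # everything else
    return busy, quiet


def downsample(signal, stride=2):
    """Keep every stride-th sample, mimicking a lower-resolution quiet input."""
    return signal[::stride]


if __name__ == "__main__":
    static_pixel = [5.0] * 16                # no change over time
    moving_pixel = [0.0] * 8 + [1.0] * 8     # abrupt change at t = 8
    for name, trace in (("static", static_pixel), ("moving", moving_pixel)):
        busy, quiet = split_busy_quiet(trace)
        peak = max(abs(b) for b in busy)
        print(name, "peak busy response:", round(peak, 3),
              "| quiet samples kept:", len(downsample(quiet)))
```

In this toy setting the static trace yields an essentially zero busy response, while the trace with an abrupt change yields a clear one, and the quiet part can be subsampled before further processing. How BQN actually allocates computation between its two pathways is specified in the paper, not here.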

Code Implementations: 2 repos
