CV, AI, LG | Mar 26, 2023

Frame Flexible Network

arXiv:2303.14817v1 | 5 citations | h-index: 38 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses the need for flexible, efficient video recognition models by reducing repetitive training and storage costs. It is rated incremental because it builds on existing architectures.

The paper tackles the Temporal Frequency Deviation phenomenon in video recognition, where a model trained at one frame count performs poorly when evaluated at another. It proposes the Frame Flexible Network (FFN), which lets a single model be evaluated at different frame counts, and reports gains of 7.08%, 5.15%, and 2.17% at 4, 8, and 16 frames, respectively, over Uniformer on the Something-Something V1 dataset.

Existing video recognition algorithms always conduct different training pipelines for inputs with different frame numbers, which requires repetitive training operations and multiplies storage costs. If we evaluate a model at frame numbers that were not used in training, we observe that performance drops significantly (see Fig.1), which we summarize as the Temporal Frequency Deviation phenomenon. To fix this issue, we propose a general framework, named Frame Flexible Network (FFN), which not only enables the model to be evaluated at different frame numbers to adjust its computation, but also significantly reduces the memory cost of storing multiple models. Concretely, FFN integrates several sets of training sequences, involves Multi-Frequency Alignment (MFAL) to learn temporal frequency invariant representations, and leverages Multi-Frequency Adaptation (MFAD) to further strengthen the representation abilities. Comprehensive empirical validations using various architectures and popular benchmarks solidly demonstrate the effectiveness and generalization of FFN (e.g., 7.08/5.15/2.17% performance gain at Frame 4/8/16 on the Something-Something V1 dataset over Uniformer). Code is available at https://github.com/BeSpontaneous/FFN.
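
To make "evaluating at different frames" concrete, here is a minimal sketch of how one video can yield 4-, 8-, and 16-frame input sequences with different temporal sampling rates. It is not taken from the FFN code. The function name `sample_frame_indices`, the 64-frame clip length, and the choice of segment-centre sampling are all assumptions made for illustration, and the linked repository may sample frames differently.

```python
def sample_frame_indices(num_video_frames: int, num_samples: int) -> list[int]:
    """Return the centre frame index of each of `num_samples` equal segments.

    Hypothetical helper for illustration only; not part of the FFN codebase.
    """
    if num_video_frames <= 0 or num_samples <= 0:
        raise ValueError("both arguments must be positive")
    segment = num_video_frames / num_samples
    # Clamp to the last valid index in case of rounding at the clip boundary.
    return [
        min(num_video_frames - 1, int(segment * i + segment / 2))
        for i in range(num_samples)
    ]


# Assumed clip length; real videos vary.
clip_length = 64

# One index sequence per frame count mentioned in the abstract (4/8/16).
sequences = {t: sample_frame_indices(clip_length, t) for t in (4, 8, 16)}

for t, indices in sequences.items():
    stride = indices[1] - indices[0]
    print(f"{t:>2} frames, stride {stride:>2}: {indices}")
```

Under these assumptions, the gap between sampled frames shrinks from 16 to 8 to 4 as the frame count grows, so the same video reaches the model at three different temporal frequencies. This mismatch between the training and evaluation sampling rate is the setting in which the abstract reports the performance drop.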

Code Implementations: 2 repos
