CV · Sep 22, 2025

MoCrop: Training Free Motion Guided Cropping for Efficient Video Action Recognition

arXiv:2509.18473v1 · h-index: 4 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses the computational cost of video action recognition in the compressed domain and offers a practical option for real-time deployment. The advance is incremental: it adapts existing compressed-domain methods rather than proposing a new recognition model.

The paper tackles efficient video action recognition by introducing MoCrop, a training-free, motion-guided cropping module that uses motion vectors from compressed video to focus on motion-dense regions. On UCF101 with ResNet-50, it reports up to +3.5% Top-1 accuracy at equal FLOPs, or +2.4% Top-1 accuracy with 26.5% fewer FLOPs.

We introduce MoCrop, a motion-aware adaptive cropping module for efficient video action recognition in the compressed domain. MoCrop uses motion vectors that are available in H.264 video to locate motion-dense regions and produces a single clip-level crop that is applied to all I-frames at inference. The module is training free, adds no parameters, and can be plugged into diverse backbones. A lightweight pipeline that includes denoising & merge (DM), Monte Carlo sampling (MCS), and adaptive cropping (AC) via a motion-density submatrix search yields robust crops with negligible overhead. On UCF101, MoCrop improves accuracy or reduces compute. With ResNet-50, it delivers +3.5% Top-1 accuracy at equal FLOPs (attention setting), or +2.4% Top-1 accuracy with 26.5% fewer FLOPs (efficiency setting). Applied to CoViAR, it reaches 89.2% Top-1 accuracy at the original cost and 88.5% Top-1 accuracy while reducing compute from 11.6 to 8.5 GFLOPs. Consistent gains on MobileNet-V3, EfficientNet-B1, and Swin-B indicate strong generality and make MoCrop practical for real-time deployment in the compressed domain. Our code and models are available at https://github.com/microa/MoCrop.
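The abstract names a motion-density submatrix search but gives no details of it. As a rough illustration of the general idea only, the sketch below finds the fixed-size window with the largest total motion in a 2D grid of per-block motion magnitudes, using a summed-area table. The function names, the fixed window size, and the grid layout are all assumptions made for this example; the authors' actual procedure (including the denoising & merge and Monte Carlo sampling stages) may differ and is available in their repository.

```python
from typing import List, Tuple


def summed_area_table(grid: List[List[float]]) -> List[List[float]]:
    """Return a (h+1) x (w+1) table where sat[y][x] is the sum of grid[:y][:x]."""
    h, w = len(grid), len(grid[0])
    sat = [[0.0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0.0
        for x in range(w):
            row_sum += grid[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def densest_window(
    grid: List[List[float]], win_h: int, win_w: int
) -> Tuple[int, int, float]:
    """Hypothetical helper: return (top, left, total) of the win_h x win_w
    window with the largest summed motion magnitude.

    `grid` is assumed to hold one non-negative motion magnitude per block
    (for example, accumulated motion-vector lengths over a clip).
    Ties are resolved in favour of the first window in row-major order.
    """
    h, w = len(grid), len(grid[0])
    if not (0 < win_h <= h and 0 < win_w <= w):
        raise ValueError("window must fit inside the grid")

    sat = summed_area_table(grid)
    best_top, best_left, best_total = 0, 0, float("-inf")
    for top in range(h - win_h + 1):
        for left in range(w - win_w + 1):
            bottom, right = top + win_h, left + win_w
            # Window sum from four table lookups (inclusion-exclusion).
            total = (
                sat[bottom][right]
                - sat[top][right]
                - sat[bottom][left]
                + sat[top][left]
            )
            if total > best_total:
                best_top, best_left, best_total = top, left, total
    return best_top, best_left, best_total
```

For a toy 4x5 grid whose motion is concentrated in rows 1-2 and columns 2-3, `densest_window(grid, 2, 2)` selects the window starting at row 1, column 2. Each window sum costs four lookups, so the search is linear in the number of candidate positions, which is consistent with the abstract's claim of negligible overhead but is not a measurement of it.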

