CV · Aug 18, 2023

Boosting Few-shot Action Recognition with Graph-guided Hybrid Matching

Jiazheng Xing, Mengmeng Wang, Yudi Ruan, Bofan Chen, Yaowei Guo, Boyu Mu, Guang Dai, Jingdong Wang, Yong Liu

arXiv:2308.09346v1 · 16.441 citations · h-index: 73 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of recognizing similar action categories in few-shot learning, offering incremental improvements for video analysis tasks.

The paper tackles the problem of few-shot action recognition by proposing a new framework, GgHM, which improves class prototype construction and matching through graph-guided optimization and hybrid strategies, leading to consistent performance gains on benchmark datasets.

Class prototype construction and matching are core aspects of few-shot action recognition. Previous methods mainly focus on designing spatiotemporal relation modeling modules or complex temporal alignment algorithms. Despite their promising results, they overlook the value of class prototype construction and matching, which leads to unsatisfactory performance when recognizing similar categories in every task. In this paper, we propose GgHM, a new framework with Graph-guided Hybrid Matching. Concretely, we learn task-oriented features under the guidance of a graph neural network during class prototype construction, explicitly optimizing intra- and inter-class feature correlation. Next, we design a hybrid matching strategy that combines frame-level and tuple-level matching to classify videos with multivariate styles. We additionally propose a learnable dense temporal modeling module that enhances the temporal representation of video features, building a more solid foundation for the matching process. GgHM shows consistent improvements over other challenging baselines on several few-shot datasets, demonstrating the effectiveness of our method. The code will be publicly available at https://github.com/jiazheng-xing/GgHM.
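To make the idea of combining frame-level and tuple-level matching more concrete, here is a minimal NumPy sketch. It is not the GgHM implementation: the graph neural network guidance and the learnable dense temporal modeling module are omitted, prototypes are plain per-class means, and the bidirectional nearest-neighbor cosine distance is an assumed stand-in for the paper's matching metric. All function names and the `alpha` weight are hypothetical.

```python
import numpy as np

def l2_normalize(x, eps=1e-8):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def class_prototypes(support, labels, n_classes):
    # support: (N, T, D) per-frame features; labels: (N,) integer class ids.
    # Assumption: a prototype is the mean of its class's support videos.
    return np.stack([support[labels == c].mean(axis=0) for c in range(n_classes)])

def set_distance(a, b):
    # Bidirectional nearest-neighbor cosine distance between two sets of
    # vectors, a: (M, D) and b: (K, D). Assumed stand-in metric.
    dist = 1.0 - l2_normalize(a) @ l2_normalize(b).T
    return 0.5 * (dist.min(axis=1).mean() + dist.min(axis=0).mean())

def tuple_features(frames):
    # Build ordered frame pairs (i < j) by concatenating their features.
    # frames: (T, D) -> (T * (T - 1) / 2, 2 * D)
    i, j = np.triu_indices(frames.shape[0], k=1)
    return np.concatenate([frames[i], frames[j]], axis=1)

def hybrid_distance(query, proto, alpha=0.5):
    # Weighted sum of frame-level and tuple-level distances.
    frame_d = set_distance(query, proto)
    tuple_d = set_distance(tuple_features(query), tuple_features(proto))
    return alpha * frame_d + (1.0 - alpha) * tuple_d

def classify(query, prototypes, alpha=0.5):
    # Predict the class whose prototype is closest under the hybrid distance.
    return int(np.argmin([hybrid_distance(query, p, alpha) for p in prototypes]))

# Toy 2-way 3-shot episode with 4 frames of 8-dimensional features.
rng = np.random.default_rng(0)
bases = rng.normal(size=(2, 4, 8))
labels = np.repeat(np.arange(2), 3)
support = bases[labels] + 0.01 * rng.normal(size=(6, 4, 8))
prototypes = class_prototypes(support, labels, n_classes=2)
prediction = classify(bases[1], prototypes)
```

The frame-level term compares individual frames, while the tuple-level term compares ordered frame pairs and is therefore sensitive to temporal order. The weight `alpha` simply trades one off against the other in this sketch.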

