CV · Oct 7, 2021

A Baseline Framework for Part-level Action Parsing and Action Recognition

arXiv:2110.03368v2 · 3 citations
AI Analysis

This is an incremental solution for a specific competition in video action analysis.

The paper proposes a baseline framework for part-level action parsing and action recognition built on YOLOF, HRNet, and CSN, reaching 61.37% mAP on the Kinetics-TPS test set.

This technical report introduces our 2nd place solution to the Kinetics-TPS Track on Part-level Action Parsing at the ICCV DeeperAction Workshop 2021. Our entry is mainly based on YOLOF for instance and part detection, HRNet for human pose estimation, and CSN for video-level action recognition and frame-level part state parsing. We describe technical details for the Kinetics-TPS dataset, together with some experimental results. In the competition, we achieved 61.37% mAP on the test set of Kinetics-TPS.
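The division of labour described in the abstract can be sketched roughly as follows. This is only an illustrative skeleton, not the authors' code: the function `parse_video`, the dataclasses, and every callable passed in are hypothetical placeholders standing in for the YOLOF detectors, the HRNet pose estimator, and the CSN classifiers. How the real system combines these outputs is not stated in the abstract, so the sketch simply collects them side by side.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed pixel coordinates


@dataclass
class PartResult:
    name: str   # body-part label, e.g. "head" (label set is an assumption)
    box: Box
    state: str  # frame-level part state


@dataclass
class PersonResult:
    box: Box
    keypoints: List[Tuple[float, float]]
    parts: List[PartResult] = field(default_factory=list)


def parse_video(
    frames: Sequence,
    detect_persons: Callable,       # stand-in for a YOLOF instance detector
    estimate_pose: Callable,        # stand-in for HRNet pose estimation
    detect_parts: Callable,         # stand-in for a YOLOF part detector
    classify_part_state: Callable,  # stand-in for a CSN part-state classifier
    classify_action: Callable,      # stand-in for a CSN video-level classifier
):
    """Run the hypothetical stages and return (video_action, per_frame_results)."""
    per_frame = []
    for frame in frames:
        persons = []
        for box in detect_persons(frame):
            keypoints = estimate_pose(frame, box)
            parts = [
                PartResult(name, part_box, classify_part_state(frame, name, part_box))
                for name, part_box in detect_parts(frame, box)
            ]
            persons.append(PersonResult(box, keypoints, parts))
        per_frame.append(persons)
    # The video-level label is predicted once from the whole clip.
    return classify_action(frames), per_frame


# Usage with trivial stub models, just to show the data flow.
frames = ["frame0", "frame1"]
action, per_frame = parse_video(
    frames,
    detect_persons=lambda f: [(0.0, 0.0, 10.0, 20.0)],
    estimate_pose=lambda f, b: [(5.0, 5.0)],
    detect_parts=lambda f, b: [("head", (2.0, 0.0, 8.0, 6.0))],
    classify_part_state=lambda f, n, b: "still",
    classify_action=lambda fs: "placeholder_action",
)
```

Passing the models in as callables keeps the sketch independent of any particular detection or video-recognition library; real models would replace the lambdas.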

