A Training Method For VideoPose3D With Ideology of Action Recognition
This incremental improvement makes training for video-based human pose estimation faster and more flexible, benefiting researchers and practitioners in computer vision.
The paper tackles the problem of improving VideoPose3D training by integrating action recognition, resulting in a method that achieves similar results with less data for pose estimation and outperforms the original by 4.5% on action-oriented tasks in terms of Velocity Error of MPJPE.
Action recognition and pose estimation from videos are closely related tasks for understanding human motion, but most of the literature addresses pose estimation in isolation from action recognition. This research presents a faster and more flexible training method for VideoPose3D that is based on action recognition. The model is trained on the same type of action as the one it will estimate, and different types of actions can be trained separately. Evidence shows that, for common pose-estimation tasks, the model needs a relatively small amount of data to achieve results similar to those of the original research, and that, for action-oriented tasks, it outperforms the original research by 4.5% on Velocity Error of MPJPE with a limited receptive field size and a limited number of training epochs. The model can handle both action-oriented and common pose-estimation problems.
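The abstract does not define its headline metric, so the following is a minimal sketch of how it is commonly computed. It assumes the definition used in the original VideoPose3D work, where the velocity error is the MPJPE (mean per-joint position error) of the first derivative of the 3D pose sequences. The function names and the toy one-joint data are illustrative assumptions, not code from the paper.

```python
import math


def mpjpe(pred, target):
    """Mean Euclidean distance over all frames and joints.

    pred, target: nested lists shaped [frames][joints][3].
    """
    dists = [
        math.dist(p_joint, t_joint)
        for p_frame, t_frame in zip(pred, target)
        for p_joint, t_joint in zip(p_frame, t_frame)
    ]
    return sum(dists) / len(dists)


def first_difference(seq):
    """Frame-to-frame displacement of every joint (discrete first derivative)."""
    return [
        [[b - a for a, b in zip(j_prev, j_next)]
         for j_prev, j_next in zip(f_prev, f_next)]
        for f_prev, f_next in zip(seq, seq[1:])
    ]


def velocity_error(pred, target):
    """MPJPE computed on first differences instead of positions."""
    return mpjpe(first_difference(pred), first_difference(target))


# Toy data (hypothetical): one joint moving along x over three frames.
target = [[[0.0, 0.0, 0.0]], [[1.0, 0.0, 0.0]], [[2.0, 0.0, 0.0]]]

# Prediction with a constant offset of 3 along z: wrong position, right motion.
pred_offset = [[[0.0, 0.0, 3.0]], [[1.0, 0.0, 3.0]], [[2.0, 0.0, 3.0]]]

# Prediction that is correct except for a one-frame jitter along z.
pred_jitter = [[[0.0, 0.0, 0.0]], [[1.0, 0.0, 1.0]], [[2.0, 0.0, 0.0]]]

print(mpjpe(pred_offset, target), velocity_error(pred_offset, target))
print(mpjpe(pred_jitter, target), velocity_error(pred_jitter, target))
```

In this toy case the constant offset gives a large position error but zero velocity error, while the single jittered frame gives a smaller position error but a larger velocity error. Under the assumed definition, the metric therefore rewards temporally smooth, motion-consistent predictions, which is why it suits action-oriented evaluation.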