A Data-driven Approach for Human Pose Tracking Based on Spatio-temporal Pictorial Structure
This work addresses human pose tracking in video, but the contribution is incremental: it combines existing methods, namely single-image pose estimation and object tracking.
The paper formulates human pose tracking in video as a discrete optimization problem over a spatio-temporal pictorial structure model, solves it efficiently in a greedy framework, and evaluates it on benchmark datasets, including a new ICDPose dataset.
In this paper, we present a data-driven approach for human pose tracking in video data. We formulate human pose tracking as a discrete optimization problem based on a spatio-temporal pictorial structure model and solve it very efficiently in a greedy framework. The model tracks the human pose by combining single-image human pose estimation with traditional object tracking in video. Our pose tracking objective function consists of the following terms: the appearance likelihood of a part within a frame, the temporal displacement of the part from the previous frame to the current frame, and the spatial dependency of a part on its parent in the graph structure. Experimental evaluation on benchmark datasets (VideoPose2, Poses in the Wild and Outdoor Pose) as well as on our newly built ICDPose dataset shows the usefulness of our proposed method.
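To make the three-term objective concrete, the following is a minimal sketch of one way a greedy, per-frame selection could look. It is an illustration under stated assumptions, not the paper's implementation: the function name `greedy_pose_track`, the quadratic form of the temporal and spatial penalties, the weights `w_app`, `w_temp`, `w_spat`, the fixed parent offsets, and the per-part candidate lists are all hypothetical choices made here for the example.

```python
import numpy as np


def greedy_pose_track(candidates, scores, parents, offsets, init_pose,
                      w_app=1.0, w_temp=0.01, w_spat=0.01):
    """Greedily pick one candidate location per part per frame.

    candidates[t][p]: (K, 2) candidate positions for part p in frame t
    scores[t][p]:     (K,) appearance scores (higher is better)
    parents[p]:       parent index of part p, -1 for the root;
                      parents are assumed to be listed before their children
    offsets[p]:       assumed expected (dx, dy) of part p relative to its parent
    init_pose:        (P, 2) pose in the frame preceding the first tracked frame
    """
    offsets = np.asarray(offsets, dtype=float)
    prev = np.asarray(init_pose, dtype=float)
    track = []
    for cand_t, score_t in zip(candidates, scores):
        pose = np.zeros_like(prev)
        for p, parent in enumerate(parents):
            xy = np.asarray(cand_t[p], dtype=float)
            # Appearance term: better-scoring candidates get lower cost.
            cost = -w_app * np.asarray(score_t[p], dtype=float)
            # Temporal term: penalize displacement from the previous frame.
            cost = cost + w_temp * np.sum((xy - prev[p]) ** 2, axis=1)
            # Spatial term: penalize deviation from the expected parent offset.
            if parent >= 0:
                rel = xy - pose[parent] - offsets[p]
                cost = cost + w_spat * np.sum(rel ** 2, axis=1)
            pose[p] = xy[np.argmin(cost)]
        track.append(pose)
        prev = pose
    return np.stack(track)


# Toy example: a root part and one child part, tracked over a single frame.
parents = [-1, 0]
offsets = [[0.0, 0.0], [0.0, 10.0]]
init_pose = [[0.0, 0.0], [0.0, 10.0]]
candidates = [[np.array([[1.0, 0.0], [50.0, 50.0]]),
               np.array([[1.0, 10.0], [40.0, 40.0]])]]
scores = [[np.array([0.5, 0.6]), np.array([0.5, 0.5])]]
track = greedy_pose_track(candidates, scores, parents, offsets, init_pose)
```

In the toy example the distant candidates score equally well or slightly better on appearance, but the temporal and spatial penalties outweigh that, so the nearby, structurally consistent candidates are selected. Because each part is fixed as soon as it is visited, this sketch trades global optimality for speed, which is the usual compromise of a greedy scheme.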