MoDeep: A Deep Learning Framework Using Motion Features for Human Pose Estimation
This work addresses accurate human pose estimation in videos, a known bottleneck in computer vision applications. It is an incremental improvement that applies a novel method to that bottleneck.
The authors tackle human pose estimation in videos with a deep learning framework that incorporates both color and motion features. On their new FLIC-motion dataset, they report significantly better performance than state-of-the-art pose detection systems.
In this work, we propose a novel and efficient method for articulated human pose estimation in videos using a convolutional network architecture, which incorporates both color and motion features. We propose a new human body pose dataset, FLIC-motion, that extends the FLIC dataset with additional motion features. We apply our architecture to this dataset and report significantly better performance than current state-of-the-art pose detection systems.
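To make the idea of combining color and motion features concrete, the following minimal sketch builds a multi-channel network input from two consecutive frames. It is an illustration under stated assumptions, not the authors' pipeline: the abstract does not say which motion features are used, so a simple absolute frame difference stands in for them here, and the helper names (`to_gray`, `frame_difference`, `stack_color_and_motion`) are hypothetical.

```python
# Illustrative sketch only. Frames are plain nested lists shaped
# 3 x H x W (channel, row, column) with float pixel values.
# ASSUMPTION: absolute frame difference is used as a stand-in motion
# feature; the source text does not specify the motion features.

def to_gray(rgb):
    """Average the three color channels into a single H x W image."""
    r, g, b = rgb
    return [
        [(rv + gv + bv) / 3.0 for rv, gv, bv in zip(r_row, g_row, b_row)]
        for r_row, g_row, b_row in zip(r, g, b)
    ]

def frame_difference(prev_gray, curr_gray):
    """Per-pixel absolute difference between two H x W grayscale frames."""
    return [
        [abs(c - p) for p, c in zip(p_row, c_row)]
        for p_row, c_row in zip(prev_gray, curr_gray)
    ]

def stack_color_and_motion(prev_rgb, curr_rgb):
    """Return a 4 x H x W input: three color channels plus one motion channel."""
    motion = frame_difference(to_gray(prev_rgb), to_gray(curr_rgb))
    return [curr_rgb[0], curr_rgb[1], curr_rgb[2], motion]

# Example: a 2 x 2 scene that brightens uniformly between two frames.
prev_frame = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(3)]
curr_frame = [[[0.5, 0.5], [0.5, 0.5]] for _ in range(3)]
network_input = stack_color_and_motion(prev_frame, curr_frame)
```

A convolutional network would then take `network_input` in place of a plain three-channel image, so that its first layer sees color and motion side by side. Other motion features, such as optical flow, could be stacked in the same way.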