R-CNNs for Pose Estimation and Action Detection
This work addresses keypoint (pose) prediction and action recognition for people in images, both of which matter for computer vision applications. The contribution appears incremental, however, since it builds on existing R-CNN methods.
The authors tackle keypoint (pose) prediction and action classification of people in unconstrained images, using R-CNN detectors trained with task-specific loss functions. They report state-of-the-art results for both tasks on the PASCAL VOC dataset and introduce a new dataset for action detection.
We present convolutional neural networks for the tasks of keypoint (pose) prediction and action classification of people in unconstrained images. Our approach involves training an R-CNN detector with loss functions depending on the task being tackled. We evaluate our method on the challenging PASCAL VOC dataset and compare it to previous leading approaches. Our method gives state-of-the-art results for keypoint and action prediction. Additionally, we introduce a new dataset for action detection, the task of simultaneously localizing people and classifying their actions, and present results using our approach.
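The abstract does not give the form of the task-dependent losses, so the following is only a minimal sketch of one way such a loss could be assembled, not the method used in this work. It assumes a person-vs-background detection term plus a task term (for example, action classification) that is counted only for regions labeled as a person. The names `softmax_cross_entropy`, `multitask_loss`, and `task_weight` are hypothetical and chosen for illustration.

```python
import math


def softmax_cross_entropy(logits, target):
    """Cross-entropy of a softmax over `logits` against the integer class `target`."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]


def multitask_loss(det_logits, is_person, task_logits, task_label, task_weight=1.0):
    """Detection loss plus a task loss counted only for person regions.

    Assumption (not from the source): background regions contribute only the
    detection term, and `task_weight` balances the two terms.
    """
    loss = softmax_cross_entropy(det_logits, is_person)
    if is_person == 1:
        loss += task_weight * softmax_cross_entropy(task_logits, task_label)
    return loss


# A background region ignores the task head; a person region adds the task term.
background = multitask_loss([0.0, 0.0], 0, [1.0, 2.0, 3.0], 1)
person = multitask_loss([0.0, 0.0], 1, [0.0, 0.0, 0.0], 2)
```

Under these assumptions, switching tasks would only mean swapping the task term (for instance, a per-keypoint localization loss instead of an action classification loss) while the detection term stays the same.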