Recurrent Human Pose Estimation
This work addresses 2D human pose estimation and offers a simpler alternative to methods that rely on a graphical model stage. The contribution appears incremental, since the model matches rather than surpasses the state of the art.
The authors tackle 2D human pose estimation with a ConvNet model that regresses a heatmap for each body keypoint. It achieves performance on par with state-of-the-art methods on two benchmark datasets without using complex graphical models.
We propose a novel ConvNet model for predicting 2D human body poses in an image. The model regresses a heatmap representation for each body keypoint, and is able to learn and represent both the part appearances and the context of the part configuration. We make the following three contributions: (i) an architecture combining a feed-forward module with a recurrent module, where the recurrent module can be run iteratively to improve performance; (ii) end-to-end training from scratch, with auxiliary losses incorporated to improve performance; (iii) an investigation of whether keypoint visibility can also be predicted. The model is evaluated on two benchmark datasets. The result is a simple architecture that achieves performance on par with the state of the art, but without the complexity of a graphical model stage (or layers).
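The heatmap representation and the iterative use of the recurrent module can be illustrated with a minimal sketch in plain Python. Everything here is an assumption made for illustration, not the authors' implementation: the Gaussian target shape, the `sigma` value, the squared-error loss, the argmax decoding, and the interfaces of the hypothetical `feed_forward` and `recurrent` callables (in particular, that the recurrent module receives features plus the previous heatmaps).

```python
import math


def gaussian_heatmap(height, width, cx, cy, sigma=1.5):
    """Build a target heatmap: a 2D Gaussian centred on keypoint (cx, cy).

    The Gaussian shape and sigma are illustrative assumptions.
    """
    return [
        [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
         for x in range(width)]
        for y in range(height)
    ]


def decode_keypoint(heatmap):
    """Return (x, y, score) at the location of the maximum response."""
    best = (0, 0, heatmap[0][0])
    for y, row in enumerate(heatmap):
        for x, value in enumerate(row):
            if value > best[2]:
                best = (x, y, value)
    return best


def mse_loss(pred, target):
    """Mean squared error between two heatmaps of equal size."""
    count = len(pred) * len(pred[0])
    total = sum(
        (p - t) ** 2
        for pred_row, target_row in zip(pred, target)
        for p, t in zip(pred_row, target_row)
    )
    return total / count


def run_iteratively(feed_forward, recurrent, image, n_iters=2):
    """Collect the feed-forward prediction and each recurrent refinement.

    `feed_forward(image)` is assumed to return (features, heatmap);
    `recurrent(features, heatmap)` is assumed to return a refined heatmap.
    Both are hypothetical placeholders for the two modules.
    """
    features, heatmap = feed_forward(image)
    outputs = [heatmap]
    for _ in range(n_iters):
        heatmap = recurrent(features, heatmap)
        outputs.append(heatmap)
    return outputs


def total_loss(outputs, target):
    """Sum the loss over every intermediate output (auxiliary losses)."""
    return sum(mse_loss(out, target) for out in outputs)
```

In this sketch, applying a loss to every entry returned by `run_iteratively`, rather than only the last one, is one plausible reading of "auxiliary losses"; at test time only the final heatmap would be passed to `decode_keypoint`.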