Cascaded Pyramid Network for 3D Human Pose Estimation Challenge
This work addresses 3D human pose estimation, an area less studied than its 2D counterpart, and offers an incremental improvement relevant to computer vision applications.
The paper tackles 3D human pose estimation with a two-stage framework: GlobalNet and RefineNet detect 2D joints, and an estimator built from residual blocks lifts them to 3D. The method achieves a mean per joint position error of 42.39 on the validation dataset.
Over the past decade, there has been growing interest in human pose estimation. Although much work has been done on 2D pose estimation, 3D pose estimation has received comparatively little attention. In this paper, we propose a top-down, two-stage 3D pose estimation framework. In the 2D pose estimation stage, GlobalNet and RefineNet enable us to locate occluded or invisible 2D joints; a 2D-to-3D pose estimator composed of residual blocks then lifts the 2D joints to 3D. The proposed method achieves promising results, with a mean per joint position error of 42.39 on the validation dataset of "3D Human Pose Estimation within the ECCV 2018 PoseTrack Challenge."
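For readers unfamiliar with the reported metric, the following is a minimal sketch of how mean per joint position error is commonly computed: the Euclidean distance between each predicted and ground-truth 3D joint, averaged over joints. The function name `mpjpe` and the three-joint poses are hypothetical and chosen only for illustration; they are not taken from the paper. Benchmark protocols often align the poses at a root joint before measuring the error, and that step is omitted here.

```python
import math


def mpjpe(predicted, target):
    """Average Euclidean distance between corresponding 3D joints."""
    if len(predicted) != len(target) or not predicted:
        raise ValueError("poses must be non-empty and have the same number of joints")
    # math.dist gives the Euclidean distance between two points (Python 3.8+).
    return sum(math.dist(p, t) for p, t in zip(predicted, target)) / len(predicted)


# Hypothetical 3-joint poses, for illustration only (not data from the paper).
target = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
predicted = [(3.0, 4.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 1.0)]

# Per-joint distances are 5, 0, and 1, so the mean is 2.0.
error = mpjpe(predicted, target)
print(error)
```

Over a dataset, the per-pose values would typically be averaged again across all frames to yield a single figure such as the one reported above.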