Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image
This work addresses the problem of estimating the 3D poses of multiple people from a single image. The contribution is incremental in that it extends existing single-person methods to the multi-person setting.
The paper tackles 3D multi-person pose estimation from a single RGB image by proposing a camera distance-aware top-down approach. It achieves results comparable to state-of-the-art single-person models and significantly outperforms previous multi-person methods on public datasets.
Although 3D human pose estimation has improved significantly in recent years, most previous methods consider only the single-person case. In this work, we propose the first fully learning-based, camera distance-aware top-down approach for 3D multi-person pose estimation from a single RGB image. The pipeline of the proposed system consists of three modules: human detection, absolute 3D human root localization, and root-relative 3D single-person pose estimation. Our system achieves results comparable to state-of-the-art 3D single-person pose estimation models without using any ground-truth information, and it significantly outperforms previous 3D multi-person pose estimation methods on publicly available datasets. The code is available at https://github.com/mks0601/3DMPPE_ROOTNET_RELEASE and https://github.com/mks0601/3DMPPE_POSENET_RELEASE.
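The three-module pipeline implies a final step in which each person's root-relative pose is placed at that person's absolute root position. The following is a minimal sketch of one way this could be done, not the authors' implementation. It assumes a standard pinhole camera model with known intrinsics (fx, fy, cx, cy), a root given as a pixel location plus a depth in camera space, and root-relative joints expressed in the same metric units. The function names backproject_root and assemble_absolute_poses are hypothetical.

```python
import numpy as np


def backproject_root(u, v, depth, fx, fy, cx, cy):
    """Back-project a root pixel (u, v) at a given depth into camera coordinates.

    Assumes a pinhole camera; fx, fy are focal lengths in pixels and
    (cx, cy) is the principal point. These intrinsics are assumed known.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.array([x, y, depth], dtype=np.float64)


def assemble_absolute_poses(roots_cam, relative_poses):
    """Translate root-relative poses by their absolute root positions.

    roots_cam:      (N, 3) absolute root position per detected person.
    relative_poses: (N, J, 3) joint coordinates relative to each root.
    Returns an (N, J, 3) array of joints in absolute camera coordinates.
    """
    roots_cam = np.asarray(roots_cam, dtype=np.float64)
    relative_poses = np.asarray(relative_poses, dtype=np.float64)
    # Broadcast each person's root over that person's J joints.
    return relative_poses + roots_cam[:, None, :]


if __name__ == "__main__":
    # Illustrative numbers only: two people, two joints each, units in mm.
    intrinsics = dict(fx=1500.0, fy=1500.0, cx=960.0, cy=540.0)
    roots = np.stack([
        backproject_root(960.0, 540.0, 3000.0, **intrinsics),
        backproject_root(1260.0, 540.0, 3000.0, **intrinsics),
    ])
    relative = np.array([
        [[0.0, 0.0, 0.0], [0.0, -500.0, 10.0]],
        [[0.0, 0.0, 0.0], [100.0, -450.0, -20.0]],
    ])
    print(assemble_absolute_poses(roots, relative))
```

Keeping the root position and the root-relative pose as separate quantities mirrors the modular pipeline described above: each module's output can be replaced or evaluated independently, and the two are combined only by a per-person translation.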