PoseFix: Model-agnostic General Human Pose Refinement Network
PoseFix addresses the need for a flexible, easy-to-use post-processing step that improves pose estimation accuracy in human behavior understanding applications, though the contribution is incremental because it builds on existing insights about error distributions.
The paper tackles the problem of refining 2D human pose estimates by proposing a model-agnostic network that uses error statistics to generate synthetic training data, achieving better performance than conventional methods and consistently improving various state-of-the-art pose estimation models on common benchmarks.
Multi-person pose estimation from a 2D image is an essential technique for human behavior understanding. In this paper, we propose a human pose refinement network that estimates a refined pose from a tuple of an input image and an input pose. In previous methods, pose refinement was performed mainly through an end-to-end trainable multi-stage architecture. However, such methods are highly dependent on the pose estimation model and require careful model design. By contrast, we propose a model-agnostic pose refinement method. According to a recent study, state-of-the-art 2D human pose estimation methods have similar error distributions. We use these error statistics as prior information to generate synthetic poses and use the synthesized poses to train our model. In the testing stage, the pose estimation results of any other method can be input to the proposed method. Moreover, the proposed model does not require code or knowledge about other methods, which allows it to be easily used as a post-processing step. We show that the proposed approach achieves better performance than the conventional multi-stage refinement models and consistently improves the performance of various state-of-the-art pose estimation methods on the commonly used benchmark. The code is available at https://github.com/mks0601/PoseFix_RELEASE.
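The idea of training on synthesized poses can be illustrated with a minimal sketch. It is not the released implementation: the function name `synthesize_pose`, the two error types (small jitter and a larger "miss" displacement), and all parameter values are assumptions chosen for illustration, whereas the actual method draws its error types and frequencies from the empirical statistics mentioned above.

```python
import random

def synthesize_pose(gt_pose, jitter_std=2.0, miss_prob=0.1, miss_std=20.0, rng=None):
    """Return a perturbed copy of gt_pose, a list of (x, y) keypoints in pixels.

    Hypothetical error model: each keypoint gets small Gaussian jitter, and
    with probability miss_prob it gets a much larger displacement instead.
    """
    rng = rng or random.Random()
    noisy = []
    for x, y in gt_pose:
        # Pick the noise scale for this keypoint (assumed values, not from the paper).
        std = miss_std if rng.random() < miss_prob else jitter_std
        noisy.append((x + rng.gauss(0.0, std), y + rng.gauss(0.0, std)))
    return noisy

# Ground-truth keypoints of one person (made-up coordinates).
gt_pose = [(50.0, 40.0), (48.0, 60.0), (52.0, 80.0)]
input_pose = synthesize_pose(gt_pose, rng=random.Random(0))

# With all noise disabled, the pose is returned unchanged.
clean_pose = synthesize_pose(gt_pose, jitter_std=0.0, miss_prob=0.0, rng=random.Random(0))
```

In a training loop under these assumptions, each sample would pair the image and `input_pose` as the network input with `gt_pose` as the target. Because the perturbation depends only on the ground truth, no particular pose estimator is needed at training time.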