Multi-Domain Pose Network for Multi-Person Pose Estimation and Tracking
This work addresses the challenge of training robust pose estimation models for real-world scenarios, though the contribution is incremental because it builds on existing datasets and methods.
The paper tackles multi-person pose estimation and tracking by introducing the Multi-Domain Pose Network (MDPN), which treats training on multiple datasets as a multi-domain learning task. It achieves the best performance on the PoseTrack ECCV 2018 Challenge without using additional datasets beyond MPII and COCO.
Multi-person human pose estimation and tracking in the wild is important and challenging. Large-scale training data are crucial for training a powerful model. While several datasets for human pose estimation exist, the best practice for training on multiple datasets has not been investigated. In this paper, we present a simple network called Multi-Domain Pose Network (MDPN) to address this problem. By treating the task as multi-domain learning, our method can learn a better representation for pose prediction. Together with prediction-head fine-tuning and multi-branch combination, it shows significant improvement over baselines and achieves the best performance on the PoseTrack ECCV 2018 Challenge without additional datasets other than MPII and COCO.
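One way to picture the multi-domain idea is a shared backbone feeding one prediction head per dataset, with each training sample routed to the head of the dataset it came from. The sketch below is only an illustration under that assumption; the abstract does not specify the architecture. The class name `MultiDomainPoseSketch`, the feature size, and the toy linear layers (standing in for a convolutional backbone and heatmap heads) are hypothetical. The keypoint counts (16 for MPII, 17 for COCO) come from the datasets' annotation formats, not from the abstract.

```python
import random

# Keypoint counts per dataset (from the dataset formats, not the abstract).
NUM_KEYPOINTS = {"mpii": 16, "coco": 17}
FEATURE_DIM = 8  # hypothetical size of the shared feature vector


def make_linear(n_in, n_out, rng):
    """Return an n_out x n_in weight matrix with small random values."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]


def apply_linear(weights, x):
    """Multiply a weight matrix by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


class MultiDomainPoseSketch:
    """Shared backbone plus one prediction head per dataset (domain)."""

    def __init__(self, input_dim, seed=0):
        rng = random.Random(seed)
        # Parameters shared by all domains.
        self.backbone = make_linear(input_dim, FEATURE_DIM, rng)
        # Domain-specific parameters: one head per dataset.
        self.heads = {
            name: make_linear(FEATURE_DIM, k, rng)
            for name, k in NUM_KEYPOINTS.items()
        }

    def forward(self, x, domain):
        # Shared representation (linear layer followed by ReLU).
        features = [max(0.0, v) for v in apply_linear(self.backbone, x)]
        # Only the head matching the sample's dataset is evaluated.
        return apply_linear(self.heads[domain], features)


model = MultiDomainPoseSketch(input_dim=4)
# A mixed batch: each sample carries the name of its source dataset.
mixed_batch = [([0.1, 0.2, 0.3, 0.4], "mpii"), ([0.5, 0.1, 0.0, 0.9], "coco")]
outputs = [model.forward(x, domain) for x, domain in mixed_batch]
print([len(o) for o in outputs])  # prints [16, 17]
```

In such a setup the backbone receives gradients from every dataset while each head only sees its own annotations, which is one plausible reading of how joint training could yield a better shared representation. The prediction-head fine-tuning and multi-branch combination steps mentioned in the abstract are not modeled here.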