A Dual-Source Approach for 3D Pose Estimation from a Single Image
The work addresses the training-data scarcity problem for researchers and practitioners in computer vision, but it is incremental: it builds on existing methods for 2D pose estimation and 3D pose retrieval.
The paper tackles the shortage of training data for 3D pose estimation from single RGB images. It proposes a dual-source approach that combines images annotated with 2D poses and 3D motion capture data, achieving state-of-the-art results and remaining competitive even when the skeleton structures of the two sources differ.
One major challenge for 3D pose estimation from a single RGB image is the acquisition of sufficient training data. In particular, collecting large amounts of training data that contain unconstrained images and are annotated with accurate 3D poses is infeasible. We therefore propose to use two independent training sources. The first source consists of images with annotated 2D poses and the second source consists of accurate 3D motion capture data. To integrate both sources, we propose a dual-source approach that combines 2D pose estimation with efficient and robust 3D pose retrieval. In our experiments, we show that our approach achieves state-of-the-art results and is even competitive when the skeleton structure of the two sources differs substantially.
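
To make the retrieval idea concrete, the following is a minimal sketch of how an estimated 2D pose could be used to look up candidate 3D poses in a motion capture database. It is not the method of the paper. The orthographic projection, the mean-centering and unit-norm normalization, the brute-force nearest-neighbor search, and all function names (`normalize_2d`, `project_orthographic`, `retrieve_3d_candidates`) are assumptions made for illustration only.

```python
import numpy as np


def normalize_2d(pose_2d):
    """Center a (J, 2) pose on its mean joint and scale it to unit norm.

    Assumed normalization: it removes image translation and scale so that
    poses from different sources become comparable.
    """
    centered = pose_2d - pose_2d.mean(axis=0, keepdims=True)
    scale = np.linalg.norm(centered)
    return centered / scale if scale > 0 else centered


def project_orthographic(poses_3d):
    """Assumed camera model: drop the depth axis, (N, J, 3) -> (N, J, 2)."""
    return poses_3d[..., :2]


def retrieve_3d_candidates(query_2d, mocap_3d, k=3):
    """Return indices and distances of the k mocap poses whose 2D
    projections are closest to the query 2D pose (brute-force search)."""
    projected = project_orthographic(mocap_3d)
    database = np.stack([normalize_2d(p) for p in projected])
    query = normalize_2d(query_2d)
    # Euclidean distance over all flattened joint coordinates.
    diffs = (database - query).reshape(len(database), -1)
    dists = np.linalg.norm(diffs, axis=1)
    order = np.argsort(dists)[:k]
    return order, dists[order]


# Synthetic stand-in for a mocap database: 50 poses with 14 joints each.
rng = np.random.default_rng(0)
mocap = rng.normal(size=(50, 14, 3))

# Simulated 2D estimate: pose 7 seen at a different image scale and offset.
estimated_2d = mocap[7, :, :2] * 2.5 + 10.0

indices, distances = retrieve_3d_candidates(estimated_2d, mocap, k=3)
best_3d_pose = mocap[indices[0]]
```

In this sketch the two sources stay independent: the 2D estimate could come from any 2D pose estimator, and the 3D database needs no images at all. Handling skeletons that differ between the two sources would require an additional joint mapping that is not shown here.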