CV RO | Mar 21, 2024

Exploring 3D Human Pose Estimation and Forecasting from the Robot's Perspective: The HARPER Dataset

Andrea Avogaro, Andrea Toaiari, Federico Cunico, Xiangmin Xu, Haralambos Dafas, Alessandro Vinciarelli, Emma Li, Marco Cristani

arXiv:2403.14447v25.27 citations | h-index: 41 | IROS

Originality: Synthesis-oriented

AI Analysis

This dataset addresses the challenge of analyzing human poses from a robot's viewpoint. The contribution is incremental, as it focuses on one specific scenario: dyadic interactions with a quadruped robot.

The authors address 3D human pose estimation and forecasting from a robot's perspective by introducing the HARPER dataset. It includes synchronized recordings from Spot's stereo cameras and an OptiTrack system, ground-truth skeletal representations with sub-millimeter precision, and benchmarks for 3D pose estimation, forecasting, and collision prediction.

We introduce HARPER, a novel dataset for 3D body pose estimation and forecasting in dyadic interactions between users and Spot, the quadruped robot manufactured by Boston Dynamics. The key novelty is the focus on the robot's perspective, i.e., on the data captured by the robot's sensors. These sensors make 3D body pose analysis challenging because, being close to the ground, they capture humans only partially. The scenario underlying HARPER includes 15 actions, of which 10 involve physical contact between the robot and users. The corpus contains not only the recordings of Spot's built-in stereo cameras, but also those of a 6-camera OptiTrack system (all recordings are synchronized). This yields ground-truth skeletal representations with sub-millimeter precision. In addition, the corpus includes reproducible benchmarks on 3D Human Pose Estimation, Human Pose Forecasting, and Collision Prediction, all based on publicly available baseline approaches. This enables future HARPER users to rigorously compare their results with those we provide in this work.
