RO | Jun 4

PHUMA: Physically Reliable Humanoid Locomotion Dataset

arXiv:2510.26236 | 91.911 citations | h-index: 11 | Has Code
AI Analysis

For humanoid locomotion researchers, PHUMA provides a scalable, physically consistent dataset that improves imitation learning performance and real-world transfer.

PHUMA is a physically reliable humanoid locomotion dataset (73 hours) that combines motion capture and internet videos via physics-aware curation and physics-constrained retargeting. Policies trained on it achieve higher success rates on motion tracking benchmarks than those trained on AMASS and Humanoid-X, and transfer zero-shot to a real Unitree G1 robot.

Motion imitation is a promising approach for humanoid locomotion, enabling agents to acquire humanlike behaviors. Existing methods typically rely on high-quality motion capture datasets such as AMASS, but these are scarce and expensive, limiting scalability and diversity. Recent studies attempt to scale data collection by converting large-scale internet videos, exemplified by Humanoid-X. However, they often suffer from physical artifacts such as floating, penetration, and foot skating, which hinder stable imitation. To address this, we introduce PHUMA, a Physically Reliable HUMAnoid locomotion dataset produced by a two-stage pipeline combining physics-aware curation and physics-constrained retargeting, aggregating both motion capture and internet video into a physically reliable, 73-hour corpus. On motion tracking benchmarks, PHUMA-trained policies achieve higher success rates than those trained on AMASS and Humanoid-X, and successfully transfer zero-shot to a real Unitree G1. The code is available at https://davian-robotics.github.io/PHUMA.
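
The three artifacts named in the abstract (floating, penetration, and foot skating) can be made concrete with a small check over a single foot trajectory. The sketch below is illustrative only and is not the PHUMA curation pipeline: the function names `artifact_ratios` and `is_plausible`, the input layout, and every threshold are hypothetical, and it assumes a flat ground plane at height zero.

```python
import numpy as np


def artifact_ratios(foot_height, foot_xy, fps=30.0, float_tol=0.05,
                    pen_tol=0.02, contact_tol=0.02, skate_speed=0.1):
    """Estimate how often a clip shows floating, penetration, or foot skating.

    foot_height: (T,) height of the lowest foot above the ground, in metres.
    foot_xy:     (T, 2) horizontal position of that foot, in metres.
    All tolerances are hypothetical values chosen for illustration.
    """
    h = np.asarray(foot_height, dtype=float)
    xy = np.asarray(foot_xy, dtype=float)

    # Floating: the lowest foot hovers clearly above the ground.
    floating = h > float_tol
    # Penetration: the foot sinks clearly below the ground.
    penetration = h < -pen_tol

    if len(h) < 2:
        skating_ratio = 0.0
    else:
        # Horizontal foot speed between consecutive frames (m/s).
        speed = np.linalg.norm(np.diff(xy, axis=0), axis=1) * fps
        # A transition counts as "in contact" if both frames are near the ground.
        contact = np.abs(h) <= contact_tol
        in_contact = contact[:-1] & contact[1:]
        # Skating: the foot slides while it is supposed to be planted.
        skating_ratio = float(np.mean(in_contact & (speed > skate_speed)))

    return {
        "floating": float(np.mean(floating)),
        "penetration": float(np.mean(penetration)),
        "skating": skating_ratio,
    }


def is_plausible(ratios, max_ratio=0.1):
    """Keep a clip only if every artifact ratio stays under a (hypothetical) cap."""
    return all(value <= max_ratio for value in ratios.values())
```

A curation step built along these lines would compute the ratios for each clip and drop clips for which `is_plausible` returns `False`. Note that a check this simple would also flag legitimate airborne phases such as jumps as floating, so a real filter would need to treat those separately.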

