PHASOR: Phase-Anchored Universal Action Representations for Humanoid Embodiments
For robot learning researchers, PHASOR offers a principled approach to learning universal action representations that improve transferability and interpretability across humanoid embodiments.
This work introduces PHASOR, a factorized action embedding space that leverages motion periodicity to create a cross-embodiment manifold, enabling interpretable and transferable action representations across diverse humanoid robots. The method achieves strong cross-embodiment retrieval and consistent gains on downstream robot tasks.
Learning a good action embedding space is fundamental to scalable robot policy learning, yet existing methods treat action latents as task-specific intermediates rather than first-class representations. The resulting latents are unstructured, embodiment-specific, and weakly tied to motion semantics, which limits interpretability, controllability, and transferability across robots. We position the action embedding space itself as a first-class design target, with downstream policy quality emerging from representation quality. Exploiting the intrinsic periodicity of motion, we factorize the embedding space into a phase manifold, which captures cyclic structure via FFT-parametric coefficients, and a pose branch, which conditions the manifold on non-periodic configuration detail. Combined with motion-semantic distillation, this factorized structure yields a cross-embodiment motion manifold that is interpretable and embodiment-agnostic by design. Anchoring multiple humanoid robots to a shared human-pretrained manifold then produces a unified action embedding space across diverse platforms, achieving strong cross-embodiment retrieval and consistent gains on downstream robot tasks.
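To make the phrase "FFT-parametric coefficients" concrete, the sketch below shows one common way to turn a single periodic latent curve into an amplitude, frequency, offset, and phase using a real FFT. It is an illustrative assumption, not the PHASOR implementation: the function name `fft_phase_params`, the power-weighted frequency estimate, and the choice to read the phase from the dominant frequency bin are all hypothetical and are not taken from the paper.

```python
import numpy as np


def fft_phase_params(signal, dt):
    """Summarize a 1-D latent curve by amplitude, frequency, offset, and phase.

    Hypothetical helper for illustration only; the parameterization used by
    PHASOR may differ.
    signal: samples of one latent channel over a time window.
    dt: sampling interval in seconds.
    """
    signal = np.asarray(signal, dtype=float)
    n = signal.shape[0]

    coeffs = np.fft.rfft(signal)        # complex spectrum, bins 0..n//2
    freqs = np.fft.rfftfreq(n, d=dt)    # bin center frequencies in Hz

    offset = coeffs[0].real / n         # DC term is the mean of the window
    band = np.abs(coeffs[1:]) ** 2      # power per bin, DC excluded
    total = band.sum()

    if total == 0.0:
        # Constant signal: there is no periodic component to describe.
        return {"amplitude": 0.0, "frequency": 0.0,
                "offset": offset, "phase": 0.0}

    # Power-weighted mean frequency over the non-DC bins.
    frequency = float((freqs[1:] * band).sum() / total)
    # Equals the sinusoid amplitude when all power sits in one interior bin.
    amplitude = float(2.0 / n * np.sqrt(total))
    # Phase (radians, cosine convention) read from the strongest bin.
    dominant = int(np.argmax(band)) + 1
    phase = float(np.angle(coeffs[dominant]))

    return {"amplitude": amplitude, "frequency": frequency,
            "offset": float(offset), "phase": phase}


if __name__ == "__main__":
    # One-second window sampled at 64 Hz containing a 2 Hz cosine.
    t = np.arange(64) / 64.0
    curve = 0.5 + 2.0 * np.cos(2.0 * np.pi * 2.0 * t + 0.3)
    print(fft_phase_params(curve, dt=1.0 / 64.0))
```

For a cosine that completes a whole number of cycles in the window, as in the usage example, the recovered amplitude, frequency, offset, and phase match the generating values up to floating-point error. Under this assumed parameterization, a 2-D point such as `amplitude * (cos(phase), sin(phase))` would be one way to place the window on a phase manifold, with the pose branch supplying the non-periodic detail that these four numbers discard.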