Unified Walking, Running, and Recovery for Humanoids via State-Dependent Adversarial Motion Priors
This work unifies humanoid locomotion and fall recovery in a single policy, which reduces engineering complexity; methodologically, it is an incremental extension of AMP that adds a state-dependent gate.
A single reinforcement learning policy enables walking, running, and fall recovery on the Unitree G1 humanoid robot without mode-switching commands, validated on hardware with successful recovery from prone/supine falls and smooth walk-to-run transitions.
We propose a unified reinforcement learning framework that enables a single policy to perform walking, running, and fall recovery on the Unitree G1 humanoid robot, validated on physical hardware without any explicit mode-switching command at deployment. The framework extends Adversarial Motion Priors (AMP) by replacing the conventional global reference distribution with a state-dependent gate that routes each training transition to one of two discriminators: a dedicated recovery discriminator and a velocity-conditioned locomotion discriminator that jointly covers walking and running. The gate is defined by a single fixed threshold on projected gravity: the recovery discriminator is activated when body tilt exceeds approximately $37^\circ$ from vertical ($|g_z+1|>0.6$); otherwise the locomotion discriminator is used, with the normalized commanded velocity serving as a condition that selects the appropriate reference trajectory between walk and run clips. Only three LAFAN1 reference clips are required to regularize the complete behavior set. At deployment, a single frozen ONNX policy executes at 50\,Hz with no runtime mode logic; hardware experiments demonstrate successful recovery from both prone and supine falls and smooth walk-to-run transitions under the same controller.
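The gating rule described above can be illustrated with a minimal sketch. It applies the stated inequality $|g_z+1|>0.6$ literally to route each transition to one of the two discriminators. The function names, the string labels, and the convention that projected gravity is a unit vector equal to $(0, 0, -1)$ when the robot is upright are assumptions made for illustration; they are not taken from the authors' implementation.

```python
from typing import Dict, List, Sequence

# Threshold quoted in the abstract: recovery is active when |g_z + 1| > 0.6.
RECOVERY_THRESHOLD = 0.6


def select_discriminator(projected_gravity: Sequence[float],
                         threshold: float = RECOVERY_THRESHOLD) -> str:
    """Return the label of the discriminator a transition is routed to.

    Assumption: projected_gravity is a unit vector (g_x, g_y, g_z) in the
    body frame, with g_z = -1 when the robot stands upright.
    """
    g_z = projected_gravity[2]
    return "recovery" if abs(g_z + 1.0) > threshold else "locomotion"


def route_batch(gravities: Sequence[Sequence[float]]) -> Dict[str, List[int]]:
    """Split a batch of transitions into index lists, one per discriminator."""
    routed: Dict[str, List[int]] = {"recovery": [], "locomotion": []}
    for index, gravity in enumerate(gravities):
        routed[select_discriminator(gravity)].append(index)
    return routed


# Hypothetical batch of projected-gravity vectors.
batch = [
    (0.0, 0.0, -1.0),  # upright:           |g_z + 1| = 0.0
    (0.6, 0.0, -0.8),  # moderately tilted: |g_z + 1| = 0.2
    (1.0, 0.0, 0.0),   # lying flat:        |g_z + 1| = 1.0
    (0.0, 0.0, 1.0),   # inverted:          |g_z + 1| = 2.0
]
routed = route_batch(batch)
print(routed)
```

Because the rule depends only on the current state, the same check could in principle be evaluated per transition during training without any mode flag being passed to the policy, which is consistent with the claim that no mode-switching command is needed at deployment.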