No More Marching: Learning Humanoid Locomotion for Short-Range SE(2) Targets
This work addresses the need for fast, robust, and energy-efficient short-range movements by humanoids in real-world workspaces. It represents an incremental improvement over existing learning-based methods.
The paper addresses the inefficient, marching-style locomotion that humanoids exhibit when moving to short-range SE(2) target poses. It develops a reinforcement learning approach with a constellation-based reward function, which improves energy consumption, time-to-target, and footstep count relative to standard methods and transfers successfully from simulation to hardware.
Humanoids operating in real-world workspaces must frequently execute task-driven, short-range movements to SE(2) target poses. To be practical, these transitions must be fast, robust, and energy efficient. While learning-based locomotion has made significant progress, most existing methods optimize for velocity tracking rather than direct pose reaching, resulting in inefficient, marching-style behavior when applied to short-range tasks. In this work, we develop a reinforcement learning approach that directly optimizes humanoid locomotion for SE(2) targets. Central to this approach is a new constellation-based reward function that encourages natural and efficient target-oriented movement. To evaluate performance, we introduce a benchmarking framework that measures energy consumption, time-to-target, and footstep count over a distribution of SE(2) goals. Our results show that the proposed approach consistently outperforms standard methods and enables successful transfer from simulation to hardware, highlighting the importance of targeted reward design for practical short-range humanoid locomotion.
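
The abstract does not define the constellation-based reward, so the following is only a minimal sketch of one plausible reading, not the authors' formulation. It assumes a small set of points rigidly attached to the robot base (the "constellation"), and scores a pose by the mean distance between those points under the current pose and under the SE(2) target pose. The point layout, the function names, and the exponential shaping with its `scale` parameter are all hypothetical choices made for illustration.

```python
import math

def se2_transform(pose, point):
    """Map a point from the pose's local frame into the world frame.

    pose is (x, y, theta) with theta in radians; point is (px, py).
    """
    x, y, theta = pose
    px, py = point
    c, s = math.cos(theta), math.sin(theta)
    return (x + c * px - s * py, y + s * px + c * py)

# Hypothetical constellation: three points fixed in the base frame (metres).
# The layout is an assumption for illustration, not taken from the paper.
CONSTELLATION = [(0.3, 0.0), (-0.15, 0.2), (-0.15, -0.2)]

def constellation_error(current_pose, target_pose, points=CONSTELLATION):
    """Mean Euclidean distance between corresponding constellation points."""
    total = 0.0
    for p in points:
        cx, cy = se2_transform(current_pose, p)
        tx, ty = se2_transform(target_pose, p)
        total += math.hypot(cx - tx, cy - ty)
    return total / len(points)

def constellation_reward(current_pose, target_pose, scale=1.0):
    """Illustrative shaping (assumed): 1.0 at the target, decaying with error."""
    return math.exp(-constellation_error(current_pose, target_pose) / scale)

if __name__ == "__main__":
    target = (0.5, 0.0, math.pi / 2)
    for pose in [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (0.5, 0.0, math.pi / 2)]:
        print(pose, round(constellation_reward(pose, target), 3))
```

In this sketch a single scalar couples position and heading error: a pose at the right location but with the wrong yaw still has a nonzero error, because the off-center points are displaced. That avoids hand-tuning separate weights for translational and rotational terms, which is one reason a point-set distance can be attractive for pose-reaching objectives.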