RO AI LG | Jun 20, 2024

HYPERmotion: Learning Hybrid Behavior Planning for Autonomous Loco-manipulation

Jin Wang, Rui Dai, Weijie Wang, Luca Rossini, Francesco Ruscelli, Nikos Tsagarakis

arXiv:2406.14655v1 | 12.216 citations

Originality: Incremental advance

AI Analysis

This work addresses the challenge of versatility and adaptability in autonomous robot loco-manipulation for applications such as household chores and work assistance. It represents an incremental improvement over existing methods.

The paper tackles the problem of enabling robots to autonomously perform hybrid motions for long-horizon tasks such as material handling. It proposes HYPERmotion, a framework that combines reinforcement learning with whole-body optimization and uses large language models for planning. The authors report that the learned motions adapt efficiently to new tasks, with high autonomy from free-text commands in unstructured scenes.

Enabling robots to autonomously perform hybrid motions in diverse environments can be beneficial for long-horizon tasks such as material handling, household chores, and work assistance. This requires extensive exploitation of intrinsic motion capabilities, extraction of affordances from rich environmental information, and planning of physical interaction behaviors. Although recent progress has demonstrated impressive humanoid whole-body control abilities, existing methods struggle to achieve versatility and adaptability for new tasks. In this work, we propose HYPERmotion, a framework that learns, selects and plans behaviors based on tasks in different scenarios. We combine reinforcement learning with whole-body optimization to generate motion for 38 actuated joints and create a motion library to store the learned skills. We apply the planning and reasoning features of large language models (LLMs) to complex loco-manipulation tasks, constructing a hierarchical task graph that comprises a series of primitive behaviors to bridge lower-level execution with higher-level planning. We leverage the interaction of distilled spatial geometry and 2D observation with a visual language model (VLM) to ground knowledge into a robotic morphology selector that chooses appropriate actions in single- or dual-arm, legged or wheeled locomotion. Experiments in simulation and the real world show that learned motions can efficiently adapt to new tasks, demonstrating high autonomy from free-text commands in unstructured scenes. Videos and website: hy-motion.github.io/
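
The abstract describes a motion library of learned skills and a task graph built from primitive behaviors. As a rough illustration of that idea only, the sketch below orders a few primitives by their dependencies and looks each one up in a library. Every name here (`Primitive`, `MOTION_LIBRARY`, `plan_order`, `execute`, the skill names and morphology labels) is a hypothetical placeholder, not taken from the paper or its code, and the lambdas merely stand in for learned skills.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass
class Primitive:
    """One node of a task graph (hypothetical structure, for illustration)."""
    name: str                      # key into the motion library
    morphology: str                # e.g. "single_arm", "dual_arm", "legged", "wheeled"
    depends_on: list[str] = field(default_factory=list)


# Hypothetical motion library: skill name -> callable standing in for a learned skill.
MOTION_LIBRARY = {
    "approach": lambda: "approach done",
    "grasp": lambda: "grasp done",
    "carry": lambda: "carry done",
}


def plan_order(primitives):
    """Return primitive names in an order that respects their dependencies."""
    graph = {p.name: set(p.depends_on) for p in primitives}
    return list(TopologicalSorter(graph).static_order())


def execute(primitives):
    """Run each primitive's skill from the library in dependency order."""
    return [MOTION_LIBRARY[name]() for name in plan_order(primitives)]


# A made-up material-handling task, listed out of order on purpose.
TASK = [
    Primitive("carry", "wheeled", depends_on=["grasp"]),
    Primitive("grasp", "dual_arm", depends_on=["approach"]),
    Primitive("approach", "wheeled"),
]

if __name__ == "__main__":
    print(plan_order(TASK))
    print(execute(TASK))
```

In the framework described above, the graph would presumably be produced by the LLM planner and the morphology chosen by the VLM-grounded selector; here both are hard-coded to keep the sketch self-contained.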
