QuadrupedGPT: Towards a Versatile Quadruped Agent in Open-ended Worlds
This work addresses the problem of creating versatile quadruped agents for complex real-world scenarios, representing an incremental advancement in robotic autonomy.
The authors tackled the challenge of building autonomous quadruped robots that can navigate, adapt, and respond to diverse commands in open-ended worlds, resulting in an agent that shows proficiency in handling diverse tasks and intricate instructions.
As robotic agents increasingly assist humans in reality, quadruped robots offer unique opportunities for interaction in complex scenarios due to their agile movement. However, building agents that can autonomously navigate, adapt, and respond to versatile goals remains a significant challenge. In this work, we introduce QuadrupedGPT designed to follow diverse commands with agility comparable to that of a pet. The primary challenges addressed include: i) effectively utilizing multimodal observations for informed decision-making; ii) achieving agile control by integrating locomotion and navigation; iii) developing advanced cognition to execute long-term objectives. Our QuadrupedGPT interprets human commands and environmental contexts using a large multimodal model. Leveraging its extensive knowledge base, the agent autonomously assigns parameters for adaptive locomotion policies and devises safe yet efficient paths toward its goals. Additionally, it employs high-level reasoning to decompose long-term goals into a sequence of executable subgoals. Through comprehensive experiments, our agent shows proficiency in handling diverse tasks and intricate instructions, representing a significant step toward the development of versatile quadruped agents for open-ended environments.