Toward Open-ended Embodied Tasks Solving
This work addresses task open-endedness in embodied AI, aiming to let robots adapt to novel and dynamic goals. It appears incremental, however, because it builds on existing diffusion and guidance techniques.
The paper tackles open-ended tasks for embodied agents such as robots, where goals are novel, multifaceted, and dynamic. It introduces the DOG framework, which combines diffusion models with training-free guidance for adaptive planning and control, and demonstrates that DOG can handle unseen task goals in maze navigation and robot control.
Empowering embodied agents, such as robots, with Artificial Intelligence (AI) has become increasingly important in recent years. A major challenge is task open-endedness. In practice, robots often need to perform tasks with novel goals that are multifaceted, dynamic, lack a definitive "end-state", and were not encountered during training. To tackle this problem, this paper introduces \textit{Diffusion for Open-ended Goals} (DOG), a novel framework designed to enable embodied AI to plan and act flexibly and dynamically for open-ended task goals. DOG synergizes the generative prowess of diffusion models with state-of-the-art, training-free guidance techniques to adaptively perform online planning and control. Our evaluations demonstrate that DOG can handle various kinds of novel task goals not seen during training, in both maze navigation and robot control problems. Our work sheds light on enhancing embodied AI's adaptability and competency in tackling open-ended goals.
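To make the idea of training-free guidance concrete, the following is a minimal sketch and not the DOG algorithm itself. It assumes a toy stand-in for a trained diffusion model (`toy_denoiser`, which simply shrinks a noisy trajectory toward zero) and a hypothetical goal energy that penalizes the distance between the final state of the trajectory and a target. All function names, the noise schedule, and the guidance scale are illustrative assumptions. At each denoising step, the gradient of the goal energy nudges the predicted clean trajectory toward the goal, and the model is never retrained.

```python
import numpy as np

def toy_denoiser(x, t):
    """Hypothetical stand-in for a trained diffusion model: predicts the clean
    trajectory as a shrunken copy of the noisy one (prior centered at zero)."""
    return 0.9 * x

def goal_energy_grad(x0, target):
    """Gradient of the assumed goal energy 0.5 * ||x0[-1] - target||^2.
    Only the final state of the trajectory is constrained."""
    grad = np.zeros_like(x0)
    grad[-1] = x0[-1] - target
    return grad

def sample(target=None, steps=50, horizon=8, dim=2, scale=0.5, seed=0):
    """Run a simplified reverse process; apply goal guidance if target is given."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((horizon, dim))  # start from pure noise
    for t in range(steps, 0, -1):
        x0_hat = toy_denoiser(x, t)  # model's estimate of the clean trajectory
        if target is not None:
            # Training-free guidance: descend the goal energy at sampling time.
            x0_hat = x0_hat - scale * goal_energy_grad(x0_hat, target)
        noise_level = (t - 1) / steps  # assumed schedule; no noise at the last step
        x = x0_hat + 0.1 * noise_level * rng.standard_normal(x.shape)
    return x

if __name__ == "__main__":
    target = np.array([2.0, -1.0])
    guided = sample(target=target)
    unguided = sample(target=None)
    print("guided final state:  ", guided[-1])
    print("unguided final state:", unguided[-1])
```

In this toy setting, the guided sample ends much closer to the target than the unguided one, even though the denoiser is identical in both runs. Changing the goal only requires swapping the energy function, which is the property that makes guidance attractive for goals not seen during training.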