Statler: State-Maintaining Language Models for Embodied Reasoning
This paper addresses the challenge of improving embodied reasoning for robotics. It is an incremental advance that adds state maintenance to existing language model approaches.
The paper tackles the problem of enabling large language models to perform better robot planning by proposing Statler, a framework that prompts models to maintain and track estimates of unobservable world states, which significantly outperforms competing methods like Code-as-Policies on several tasks.
There has been significant research interest in employing large language models to empower intelligent robots with complex reasoning. Existing work focuses on harnessing their ability to reason about the histories of their actions and observations. In this paper, we explore a new dimension in which large language models may benefit robot planning. In particular, we propose Statler, a framework in which large language models are prompted to maintain an estimate of the world state, which is often unobservable, and to track its transitions as new actions are taken. Our framework then conditions each action on the estimate of the current world state. Despite being conceptually simple, our Statler framework significantly outperforms strong competing methods (e.g., Code-as-Policies) on several robot planning tasks. Additionally, it has the potential advantage of scaling up to more challenging long-horizon planning tasks.
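The loop described above (maintain a state estimate, condition each action on it, then update the estimate after the action) can be illustrated with a minimal sketch. This is not the paper's implementation. The cups-and-ball task, the prompt layout, and the names `format_prompt`, `stub_policy`, `stub_state_update`, and `run_episode` are all assumptions made for illustration, and the two stub functions stand in for what would be language model calls.

```python
from typing import Callable, Dict, List

# Hypothetical state representation: which cup currently hides the ball.
State = Dict[str, List[bool]]


def format_prompt(state: State, instruction: str) -> str:
    """Render the current state estimate and the instruction as one prompt."""
    return f"# state = {state}\n# instruction: {instruction}\n"


def stub_policy(prompt: str) -> str:
    """Stand-in for a language model call (assumption): returns the
    instruction found in the prompt as the action to execute."""
    prefix = "# instruction: "
    for line in prompt.splitlines():
        if line.startswith(prefix):
            return line[len(prefix):]
    return ""


def stub_state_update(state: State, action: str) -> State:
    """Stand-in for a model-driven state update (assumption): applies an
    action of the form 'swap i j' to the cup contents."""
    cups = list(state["cups"])
    parts = action.split()
    if len(parts) == 3 and parts[0] == "swap":
        i, j = int(parts[1]), int(parts[2])
        cups[i], cups[j] = cups[j], cups[i]
    return {"cups": cups}


def run_episode(
    initial_state: State,
    instructions: List[str],
    policy: Callable[[str], str] = stub_policy,
    update: Callable[[State, str], State] = stub_state_update,
) -> List[State]:
    """Condition each action on the current state estimate, then update
    the estimate so the next step sees the new state."""
    state = initial_state
    trace = []
    for instruction in instructions:
        action = policy(format_prompt(state, instruction))
        state = update(state, action)
        trace.append(state)
    return trace


# The ball starts under cup 1; three swaps follow.
trace = run_episode(
    {"cups": [False, True, False]},
    ["swap 0 1", "swap 1 2", "swap 0 2"],
)
final_ball_position = trace[-1]["cups"].index(True)
```

In this sketch the state is carried explicitly from step to step, so each prompt contains only the current estimate and the new instruction instead of the full action history.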