AI | Nov 22, 2023

Physical Reasoning and Object Planning for Household Embodied Agents

Ayush Agrawal, Raghav Prabhakar, Anirudh Goyal, Dianbo Liu

MILA

arXiv:2311.13577v2 | 5.44 citations | h-index: 35 | Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of practical decision-making for household robots, though it is incremental as it focuses on dataset creation and evaluation rather than new methods.

The study tackled the problem of household embodied agents selecting substitute objects for tasks by introducing the COAT framework and four QA datasets (2K-70K questions) to analyze reasoning capabilities, revealing insights into utility alignment, contextual dependencies, and object physical states.

In this study, we explore the sophisticated domain of task planning for robust household embodied agents, with a particular emphasis on the intricate task of selecting substitute objects. We introduce the CommonSense Object Affordance Task (COAT), a novel framework designed to analyze reasoning capabilities in commonsense scenarios. This approach is centered on understanding how these agents can effectively identify and utilize alternative objects when executing household tasks, thereby offering insights into the complexities of practical decision-making in real-world environments. Drawing inspiration from factors affecting human decision-making, we explore how large language models tackle this challenge through four meticulously crafted commonsense question-and-answer datasets featuring refined rules and human annotations. Our evaluation of state-of-the-art language models on these datasets sheds light on three pivotal considerations: 1) aligning an object's inherent utility with the task at hand, 2) navigating contextual dependencies (societal norms, safety, appropriateness, and efficiency), and 3) accounting for the current physical state of the object. To maintain accessibility, we introduce five abstract variables reflecting an object's physical condition, modulated by human insights, to simulate diverse household scenarios. Our contributions include insightful human preference mappings for all three factors and four extensive QA datasets (2K, 15K, 60K, and 70K questions) probing the intricacies of utility dependencies, contextual dependencies, and object physical states. The datasets, along with our findings, are accessible at: https://github.com/Ayush8120/COAT. This research not only advances our understanding of physical commonsense reasoning in language models but also paves the way for future improvements in household agent intelligence.
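To make the evaluation setup in the abstract more concrete, the following is a minimal sketch of how one multiple-choice item about substitute-object selection could be represented and scored. Everything in it is an assumption for illustration: the class names, the field names, the state keys ("condition", "in_use"), and the example question are hypothetical and are not taken from the COAT repository, which defines its own schema and its own five abstract state variables.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectOption:
    """One candidate object offered as an answer choice."""
    name: str
    # Hypothetical abstract state variables; the real names live in the dataset.
    state: dict[str, str] = field(default_factory=dict)


@dataclass
class Question:
    """A multiple-choice item: pick the best object for a task in a context."""
    task: str
    context: str
    options: list[ObjectOption]
    answer_index: int  # index of the human-preferred option


def accuracy(questions: list[Question], predictions: list[int]) -> float:
    """Fraction of questions where the predicted index matches the annotation."""
    if len(questions) != len(predictions):
        raise ValueError("one prediction per question is required")
    if not questions:
        return 0.0
    correct = sum(q.answer_index == p for q, p in zip(questions, predictions))
    return correct / len(questions)


# Illustrative item only: same object type in two physical states, plus a distractor.
example = Question(
    task="heat leftover soup",
    context="kitchen, microwave available",
    options=[
        ObjectOption("ceramic bowl", {"condition": "clean", "in_use": "no"}),
        ObjectOption("metal pot", {"condition": "clean", "in_use": "no"}),
        ObjectOption("ceramic bowl", {"condition": "cracked", "in_use": "no"}),
    ],
    answer_index=0,
)

score = accuracy([example], [0])
```

In this sketch a model's answer would be reduced to an option index before scoring; how prompts are built and how answers are parsed is left open, since the abstract does not specify it.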
