RO · AI · CL · May 9

REI-Bench: Can Embodied Agents Understand Vague Human Instructions in Task Planning?

arXiv:2505.10872 · 57.73 citations · h-index: 2
Predicted impact top 36% in RO · last 90 days · Originality: Incremental advance
AI Analysis

For researchers and developers of LLM-based robot task planners, this work addresses the overlooked problem of vague instructions from non-expert users, making robots more accessible to the elderly and children.

This paper introduces REI-Bench, the first benchmark for robot task planning with vague human instructions, and shows that vagueness in referring expressions can reduce planning success rates by up to 36.9%. The authors propose task-oriented context cognition, which generates clear instructions for the planner and achieves state-of-the-art performance compared to aware prompts, chains of thought, and in-context learning.

Robot task planning decomposes human instructions into executable action sequences that enable robots to complete a series of complex tasks. Although recent large language model (LLM)-based task planners achieve amazing performance, they assume that human instructions are clear and straightforward. However, real-world users are not experts, and their instructions to robots often contain significant vagueness. Linguists suggest that such vagueness frequently arises from referring expressions (REs), whose meanings depend heavily on dialogue context and environment. This vagueness is even more prevalent among the elderly and children, who are the groups that robots should serve more. This paper studies how such vagueness in REs within human instructions affects LLM-based robot task planning and how to overcome this issue. To this end, we propose the first robot task planning benchmark that systematically models vague REs grounded in pragmatic theory (REI-Bench), where we discover that the vagueness of REs can severely degrade robot planning performance, leading to success rate drops of up to 36.9%. We also observe that most failure cases stem from missing objects in planners. To mitigate the REs issue, we propose a simple yet effective approach: task-oriented context cognition, which generates clear instructions for robots, achieving state-of-the-art performance compared to aware prompts, chains of thought, and in-context learning. By tackling the overlooked issue of vagueness, this work contributes to the research community by advancing real-world task planning and making robots more accessible to non-expert users, e.g., the elderly and children.

