RO | AI | Feb 18, 2024

Verifiably Following Complex Robot Instructions with Foundation Models

arXiv:2402.11498v3 | 25 citations | h-index: 49 | ICRA
Originality: Highly original
AI Analysis

The paper addresses the need for robots to disambiguate flexible user instructions and ground them in unstructured domains, a strong and specific gain for robotics.

The paper tackles the problem of enabling robots to follow complex, open-ended instructions in real-world environments. It proposes LIMP, an approach that constructs symbolic representations for verifiable behavior synthesis, and reports a 79% success rate on complex spatiotemporal instructions, compared to 38% for baselines.

When instructing robots, users want to flexibly express constraints, refer to arbitrary landmarks, and verify robot behavior, while robots must disambiguate instructions into specifications and ground instruction referents in the real world. To address this problem, we propose Language Instruction grounding for Motion Planning (LIMP), an approach that enables robots to verifiably follow complex, open-ended instructions in real-world environments without prebuilt semantic maps. LIMP constructs a symbolic instruction representation that reveals the robot's alignment with an instructor's intended motives and affords the synthesis of correct-by-construction robot behaviors. We conduct a large-scale evaluation of LIMP on 150 instructions across five real-world environments, demonstrating its versatility and ease of deployment in diverse, unstructured domains. LIMP performs comparably to state-of-the-art baselines on standard open-vocabulary tasks and additionally achieves a 79% success rate on complex spatiotemporal instructions, significantly outperforming baselines that only reach 38%. See supplementary materials and demo videos at https://robotlimp.github.io

