RO · May 31

PLanAR: Planning-Language-Grounded Agentic Reasoning for Robot Manipulation

arXiv:2602.0166261.31 citations · h-index: 13
Predicted impact: top 33% in RO · last 90 days
Originality: Incremental advance
AI Analysis

For robot manipulation researchers, PLanAR provides a structured method to integrate symbolic planning with VLMs for long-horizon tasks, though it is an incremental improvement over existing VLM-based approaches.

PLanAR introduces a planning-language interface that grounds VLM reasoning in object predicates, action schemas, and symbolic plans, enabling stepwise verification and replanning for long-horizon robot manipulation. Across multiple tasks and embodiments, it achieves strong real-world performance while exposing VLM limitations in embodied reasoning.

Recent advances in vision-language models (VLMs) have enabled increasing progress in real-world robot manipulation. However, long-horizon manipulation in unstructured environments requires VLMs to reason about changing scene states, action constraints, and execution outcomes, which remains difficult with natural language reasoning alone. We present PLanAR, a planning-language-grounded robot agent framework for open-vocabulary, long-horizon manipulation. PLanAR uses a planning-language interface to define the VLM reasoning space: object predicates represent scene states, action schemas specify robot skills with preconditions and effects, and symbolic plans provide executable intermediate representations. This interface enables stepwise verification: after each action, PLanAR uses onboard observations to check whether the expected symbolic effects have been achieved, allowing the VLM-based agent to update task states, detect failures, and replan when execution deviates from expectation. Across robot embodiments, VLM backends, and tasks including stacking, crossword solving, and long-horizon kitchen workflows, PLanAR demonstrates strong real-world capability while revealing key limitations of current VLMs in embodied reasoning.
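The stepwise verification loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: every name here (`ActionSchema`, `effects_achieved`, `run_with_verification`, `FlakyWorld`) is hypothetical, predicates are plain strings, and the VLM planner and onboard perception are replaced by simple stand-in callables. It only shows the general pattern of checking expected symbolic effects after each action and replanning when they are missing.

```python
from dataclasses import dataclass
from typing import FrozenSet

State = FrozenSet[str]  # assumption: a scene state is a set of ground predicates


@dataclass(frozen=True)
class ActionSchema:
    """Hypothetical robot skill with symbolic preconditions and effects."""
    name: str
    preconditions: FrozenSet[str]
    add_effects: FrozenSet[str]
    del_effects: FrozenSet[str] = frozenset()

    def applicable(self, state: State) -> bool:
        return self.preconditions <= state


def effects_achieved(action: ActionSchema, observed: State) -> bool:
    """True if every added predicate holds and every deleted one is gone."""
    return action.add_effects <= observed and not (action.del_effects & observed)


def run_with_verification(goal, state, planner, execute, observe, max_replans=3):
    """Execute a symbolic plan, verifying effects after each step.

    planner(state, goal) stands in for the VLM-based planner and
    observe() stands in for perception; both are assumptions.
    Returns (success, last_observed_state).
    """
    plan = list(planner(state, goal))
    replans = 0
    while not goal <= state:
        if not plan or not plan[0].applicable(state):
            if replans >= max_replans:
                return False, state
            replans += 1
            plan = list(planner(state, goal))
            if not plan:
                return False, state
            continue
        action = plan.pop(0)
        execute(action)
        state = observe()  # update the task state from observation
        if not effects_achieved(action, state):
            plan = []  # failure detected: discard the plan and replan
    return True, state


# Toy usage: a pick skill whose first execution attempt silently fails.
PICK = ActionSchema(
    name="pick(block)",
    preconditions=frozenset({"hand_empty", "on_table(block)"}),
    add_effects=frozenset({"holding(block)"}),
    del_effects=frozenset({"hand_empty", "on_table(block)"}),
)


class FlakyWorld:
    def __init__(self):
        self.state = frozenset({"hand_empty", "on_table(block)"})
        self.attempts = 0

    def execute(self, action):
        self.attempts += 1
        if self.attempts > 1 and action.applicable(self.state):
            self.state = (self.state - action.del_effects) | action.add_effects

    def observe(self):
        return self.state


world = FlakyWorld()
goal = frozenset({"holding(block)"})
success, final_state = run_with_verification(
    goal, world.state, lambda s, g: [PICK], world.execute, world.observe
)
```

In this toy run the first pick leaves the scene unchanged, so the effect check fails and the loop asks the stand-in planner for a new plan before retrying. Keeping the state as a set of predicates makes the verification a pair of set comparisons, which is one simple way to realize the "expected symbolic effects" check; the paper may represent states and checks differently.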

