ROAICVAug 21, 2025

Spatial Policy: Guiding Visuomotor Robotic Manipulation with Spatial-Aware Modeling and Reasoning

arXiv:2508.15874v24 citationsh-index: 9
Originality Highly original
AI Analysis

This work addresses the limitation of spatial awareness in robotic control for complex environments, representing a strong specific gain in the field.

The paper tackles the problem of visuomotor robotic manipulation by introducing Spatial Policy (SP), a framework that improves spatial awareness and reasoning, resulting in over 33% improvement on Meta-World and over 25% on iTHOR across 23 tasks.

Vision-centric hierarchical embodied models have demonstrated strong potential. However, existing methods lack spatial awareness capabilities, limiting their effectiveness in bridging visual plans to actionable control in complex environments. To address this problem, we propose Spatial Policy (SP), a unified spatial-aware visuomotor robotic manipulation framework via explicit spatial modeling and reasoning. Specifically, we first design a spatial-conditioned embodied video generation module to model spatially guided predictions through the spatial plan table. Then, we propose a flow-based action prediction module to infer executable actions with coordination. Finally, we propose a spatial reasoning feedback policy to refine the spatial plan table via dual-stage replanning. Extensive experiments show that SP substantially outperforms state-of-the-art baselines, achieving over 33% improvement on Meta-World and over 25% improvement on iTHOR, demonstrating strong effectiveness across 23 embodied control tasks. We additionally evaluate SP in real-world robotic experiments to verify its practical viability. SP enhances the practicality of embodied models for robotic control applications. Code and checkpoints are maintained at https://plantpotatoonmoon.github.io/SpatialPolicy/.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes