ROMar 24

NL2SpaTiaL: Generating Geometric Spatio-Temporal Logic Specifications from Natural Language for Manipulation Tasks

arXiv:2512.1367062.5h-index: 4
AI Analysis

This addresses the challenge of scalable and accurate natural language to logic translation for robotics verification, though it is incremental as it builds on existing NL-to-Logic methods by introducing hierarchical structure.

The paper tackles the problem of translating natural language into verifiable Spatio-Temporal Logic (SpaTiaL) specifications for manipulation tasks, proposing NL2SpaTiaL, a framework that models specifications as Hierarchical Logical Trees (HLT) and outperforms flat-generation baselines across various logical depths.

While Temporal Logic provides a rigorous verification framework for robotics, it typically operates on trajectory-level signals and does not natively represent the object-centric geometric relations that are central to manipulation. Spatio-Temporal Logic (SpaTiaL) overcomes this by explicitly capturing geometric spatial requirements, making it a natural formalism for manipulation-task verification. Consequently, translating natural language (NL) into verifiable SpaTiaL specifications is a critical objective. Yet, existing NL-to-Logic methods treat specifications as flat sequences, entangling nested temporal scopes with spatial relations and causing performance to degrade sharply under deep nesting. We propose NL2SpaTiaL, a framework modeling specifications as Hierarchical Logical Trees (HLT). By generating formulas as structured HLTs in a single shot, our approach decouples semantic parsing from syntactic rendering, aligning with human compositional spatial reasoning. To support this, we construct, to the best of our knowledge, the first NL-to-SpaTiaL dataset with explicit hierarchical supervision via a logic-first synthesis pipeline. Experiments with open-weight LLMs demonstrate that our HLT formulation significantly outperforms flat-generation baselines across various logical depths. These results show that explicit HLT structure is critical for scalable NL-to-SpaTiaL translation, ultimately enabling a rigorous ``generate-and-test'' paradigm for verifying candidate trajectories in language-conditioned robotics. Project website: https://sites.google.com/view/nl2spatial

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes