NL2SpaTiaL: Generating Geometric Spatio-Temporal Logic Specifications from Natural Language for Manipulation Tasks
This addresses the challenge of scalable and accurate natural language to logic translation for robotics verification, though it is incremental as it builds on existing NL-to-Logic methods by introducing hierarchical structure.
The paper tackles the problem of translating natural language into verifiable Spatio-Temporal Logic (SpaTiaL) specifications for manipulation tasks, proposing NL2SpaTiaL, a framework that models specifications as Hierarchical Logical Trees (HLT) and outperforms flat-generation baselines across various logical depths.
While Temporal Logic provides a rigorous verification framework for robotics, it typically operates on trajectory-level signals and does not natively represent the object-centric geometric relations that are central to manipulation. Spatio-Temporal Logic (SpaTiaL) overcomes this by explicitly capturing geometric spatial requirements, making it a natural formalism for manipulation-task verification. Consequently, translating natural language (NL) into verifiable SpaTiaL specifications is a critical objective. Yet, existing NL-to-Logic methods treat specifications as flat sequences, entangling nested temporal scopes with spatial relations and causing performance to degrade sharply under deep nesting. We propose NL2SpaTiaL, a framework modeling specifications as Hierarchical Logical Trees (HLT). By generating formulas as structured HLTs in a single shot, our approach decouples semantic parsing from syntactic rendering, aligning with human compositional spatial reasoning. To support this, we construct, to the best of our knowledge, the first NL-to-SpaTiaL dataset with explicit hierarchical supervision via a logic-first synthesis pipeline. Experiments with open-weight LLMs demonstrate that our HLT formulation significantly outperforms flat-generation baselines across various logical depths. These results show that explicit HLT structure is critical for scalable NL-to-SpaTiaL translation, ultimately enabling a rigorous ``generate-and-test'' paradigm for verifying candidate trajectories in language-conditioned robotics. Project website: https://sites.google.com/view/nl2spatial