RO · AI · LG · LO · Mar 4, 2025

Diverse Controllable Diffusion Policy with Signal Temporal Logic

arXiv:2503.02924v1 · 18 citations · h-index: 4 · Has Code · IEEE Robot Autom Lett
Originality: Incremental advance
AI Analysis

This work addresses the challenge of generating realistic simulations for autonomous systems, such as self-driving and human-robot interaction, with a novel hybrid approach that combines rule-based and learning-based methods.

The paper tackles the problem of generating controllable, diverse, and rule-compliant behaviors for road participants in simulation by combining Signal Temporal Logic (STL) with Diffusion Models. On NuScenes it reports the most diverse rule-compliant trajectories among the baselines at a runtime 1/17 that of the second-best approach, and in closed-loop testing the highest diversity and rule satisfaction rate and the lowest collision rate.

Generating realistic simulations is critical for autonomous-system applications such as self-driving and human-robot interaction. However, today's driving simulators still have difficulty generating controllable, diverse, and rule-compliant behaviors for road participants: rule-based models cannot produce diverse behaviors and require careful tuning, whereas learning-based methods imitate the policy from data but are not designed to follow rules explicitly. Moreover, real-world datasets are by nature "single-outcome", which makes it hard for learning methods to generate diverse behaviors. In this paper, we leverage Signal Temporal Logic (STL) and Diffusion Models to learn a controllable, diverse, and rule-aware policy. We first calibrate the STL on real-world data, then generate diverse synthetic data using trajectory optimization, and finally learn the rectified diffusion policy on the augmented dataset. We test on the NuScenes dataset, where our approach achieves the most diverse rule-compliant trajectories among the baselines, with a runtime 1/17 that of the second-best approach. In closed-loop testing, our approach reaches the highest diversity and rule satisfaction rate and the lowest collision rate. Our method can generate varied characteristics conditioned on different STL parameters at test time. A case study on human-robot encounter scenarios shows that our approach can generate diverse and close-to-oracle trajectories. The annotation tool, augmented dataset, and code are available at https://github.com/mengyuest/pSTL-diffusion-policy.
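
To make the idea of a parameterized STL rule concrete, here is a minimal sketch of the standard quantitative (robustness) semantics for "always" and "eventually" over a discrete-time signal. It is not taken from the paper or its repository: the function names, the speed-limit rule, and the reading of "calibration" as picking the tightest parameter that a recorded trajectory still satisfies are all illustrative assumptions.

```python
from typing import Callable, Sequence


def always(margin: Callable[[float], float], signal: Sequence[float]) -> float:
    """Robustness of G(margin(x) >= 0): the worst-case margin over time."""
    return min(margin(x) for x in signal)


def eventually(margin: Callable[[float], float], signal: Sequence[float]) -> float:
    """Robustness of F(margin(x) >= 0): the best-case margin over time."""
    return max(margin(x) for x in signal)


def speed_limit_robustness(speeds: Sequence[float], v_max: float) -> float:
    """Hypothetical rule G(speed <= v_max); positive means satisfied."""
    return always(lambda v: v_max - v, speeds)


def reach_goal_robustness(distances: Sequence[float], radius: float) -> float:
    """Hypothetical rule F(distance_to_goal <= radius)."""
    return eventually(lambda d: radius - d, distances)


def calibrate_speed_limit(speeds: Sequence[float]) -> float:
    """Assumed calibration: the tightest v_max the recorded speeds satisfy,
    i.e. the parameter value at which robustness is exactly zero."""
    return max(speeds)


if __name__ == "__main__":
    speeds = [4.0, 5.5, 6.0, 5.0]        # made-up speed samples (m/s)
    distances = [10.0, 6.0, 3.0, 1.0]    # made-up distances to a goal (m)

    print(speed_limit_robustness(speeds, 8.0))    # satisfied with slack
    print(speed_limit_robustness(speeds, 5.0))    # violated
    print(reach_goal_robustness(distances, 2.0))  # goal region reached
    print(calibrate_speed_limit(speeds))          # tightest satisfied limit
```

Under this reading, changing a parameter such as `v_max` changes which trajectories count as rule-compliant, which is one way to picture how a policy conditioned on STL parameters could produce different driving characteristics.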

Code Implementations: 1 repo