AI CL RO | Feb 4, 2025

From Words to Collisions: LLM-Guided Evaluation and Adversarial Generation of Safety-Critical Driving Scenarios

Yuan Gao, Mattia Piccinini, Korbinian Moller, Amr Alanwar, Johannes Betz

arXiv:2502.02145v4 | 6 citations | h-index: 11 | Has Code

Originality: Incremental advance

AI Analysis

The paper addresses the scalability and manual-effort problems of safety testing for autonomous vehicles, but the advance is incremental because it builds on existing LLM and simulation methods.

The paper tackles safety-critical scenario testing for autonomous vehicles. It uses Large Language Models (LLMs) with prompt engineering to evaluate and generate such scenarios automatically, which reduces reliance on handcrafted metrics. In simulation, the approach detects and synthesizes these scenarios effectively.

Ensuring the safety of autonomous vehicles requires virtual scenario-based testing, which depends on the robust evaluation and generation of safety-critical scenarios. So far, researchers have used scenario-based testing frameworks that rely heavily on handcrafted scenarios as safety metrics. To reduce the effort of human interpretation and overcome the limited scalability of these approaches, we combine Large Language Models (LLMs) with structured scenario parsing and prompt engineering to automatically evaluate and generate safety-critical driving scenarios. We introduce Cartesian and Ego-centric prompt strategies for scenario evaluation, and an adversarial generation module that modifies trajectories of risk-inducing vehicles (ego-attackers) to create critical scenarios. We validate our approach using a 2D simulation framework and multiple pre-trained LLMs. The results show that the evaluation module effectively detects collision scenarios and infers scenario safety. Meanwhile, the new generation module identifies high-risk agents and synthesizes realistic, safety-critical scenarios. We conclude that an LLM equipped with domain-informed prompting techniques can effectively evaluate and generate safety-critical driving scenarios, reducing dependence on handcrafted metrics. We release our open-source code and scenarios at: https://github.com/TUM-AVS/From-Words-to-Collisions.
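
The difference between the Cartesian and Ego-centric prompt strategies named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `AgentState` fields, the function names, and the prompt wording are all hypothetical, and the only assumption taken from the abstract is that one strategy describes agents in world coordinates while the other describes them relative to the ego vehicle.

```python
import math
from dataclasses import dataclass


@dataclass
class AgentState:
    """Hypothetical per-timestep state of one vehicle (not the paper's schema)."""
    agent_id: str
    x: float        # world-frame position in metres
    y: float
    heading: float  # world-frame yaw in radians
    speed: float    # metres per second


def to_ego_frame(ego: AgentState, other: AgentState) -> tuple[float, float]:
    """Return the (longitudinal, lateral) offset of `other` in the ego frame.

    Positive longitudinal means ahead of the ego; positive lateral means left.
    """
    dx, dy = other.x - ego.x, other.y - ego.y
    cos_h, sin_h = math.cos(ego.heading), math.sin(ego.heading)
    # Rotate the world-frame offset by -heading to align it with the ego axes.
    return dx * cos_h + dy * sin_h, -dx * sin_h + dy * cos_h


def cartesian_prompt(ego: AgentState, others: list[AgentState]) -> str:
    """Describe every agent in world coordinates (assumed prompt wording)."""
    def describe(a: AgentState) -> str:
        return (f"{a.agent_id} at ({a.x:.1f}, {a.y:.1f}) m, "
                f"heading {math.degrees(a.heading):.0f} deg, "
                f"speed {a.speed:.1f} m/s.")
    return "\n".join(describe(a) for a in [ego, *others])


def ego_centric_prompt(ego: AgentState, others: list[AgentState]) -> str:
    """Describe every other agent relative to the ego (assumed prompt wording)."""
    lines = [f"Ego speed {ego.speed:.1f} m/s."]
    for other in others:
        lon, lat = to_ego_frame(ego, other)
        fore_aft = "ahead" if lon >= 0 else "behind"
        side = "left" if lat >= 0 else "right"
        lines.append(f"{other.agent_id}: {abs(lon):.1f} m {fore_aft}, "
                     f"{abs(lat):.1f} m to the {side}, "
                     f"speed {other.speed:.1f} m/s.")
    return "\n".join(lines)
```

Calling `ego_centric_prompt(ego, others)` and `cartesian_prompt(ego, others)` on the same states yields two textual views of one scene. Either string could then be embedded in a larger LLM prompt that asks for a safety assessment; that wrapping step is left out here because the source does not specify it.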
