Mind the Gap: Evaluating LLM Understanding of Human-Taught Road Safety Principles
This work addresses the problem of ensuring that AI systems in autonomous vehicles can interpret road safety principles. Its contribution is incremental: it is limited to a preliminary evaluation and an analysis of existing gaps.
The study evaluated how well multi-modal large language models understand road safety concepts, using a pilot dataset of images from school textbooks. It found that the models struggled with safety reasoning, revealing gaps between human learning and model interpretation.
Following road safety norms is non-negotiable not only for humans but also for the AI systems that govern autonomous vehicles. In this work, we evaluate how well multi-modal large language models (LLMs) understand road safety concepts, specifically through schematic and illustrative representations. We curate a pilot dataset of images depicting traffic signs and road-safety norms sourced from school textbooks and use it to evaluate the models' capabilities in a zero-shot setting. Our preliminary results show that these models struggle with safety reasoning and reveal gaps between human learning and model interpretation. We further provide an analysis of these performance gaps to inform future research.
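
As an illustration of what a zero-shot evaluation over such an image dataset might look like, the following is a minimal sketch and not the procedure used in this work. It assumes a multiple-choice format, which the abstract does not specify. The names `SafetyItem`, `build_prompt`, `zero_shot_accuracy`, and the `ask_model` callable are all hypothetical; `ask_model` stands in for whatever multi-modal model interface is available.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class SafetyItem:
    """One hypothetical evaluation item: an image plus a multiple-choice question."""
    image_path: str
    question: str
    options: Tuple[str, ...]
    answer: str  # must be one of `options`


def build_prompt(item: SafetyItem) -> str:
    """Format the question with lettered options and no in-context examples (zero-shot)."""
    lettered = "\n".join(
        f"{chr(ord('A') + i)}. {opt}" for i, opt in enumerate(item.options)
    )
    return f"{item.question}\n{lettered}\nAnswer with a single letter."


def zero_shot_accuracy(
    items: Sequence[SafetyItem],
    ask_model: Callable[[str, str], str],
) -> float:
    """Return the fraction of items whose first reply letter matches the gold option.

    `ask_model(image_path, prompt)` is an assumed interface that returns the
    model's raw text reply.
    """
    if not items:
        return 0.0
    correct = 0
    for item in items:
        reply = ask_model(item.image_path, build_prompt(item)).strip().upper()
        predicted = reply[:1]
        gold = chr(ord("A") + item.options.index(item.answer))
        correct += int(predicted == gold)
    return correct / len(items)
```

Passing the model as a plain callable keeps the scoring logic independent of any particular model API, so the same loop can be reused across the models being compared.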