CV · AI · Jul 2, 2025

Large Language Models for Crash Detection in Video: A Survey of Methods, Datasets, and Challenges

arXiv:2507.02074v2 · 4 citations · h-index: 4
Originality: Synthesis-oriented
AI Analysis

The paper addresses crash detection for intelligent transportation systems, but its contribution is incremental because it is a survey rather than a new method.

This paper surveys recent methods that use large language models for crash detection in video, summarizing fusion strategies, datasets, and performance benchmarks to provide a foundation for future research.

Crash detection from video feeds is a critical problem in intelligent transportation systems. Recent developments in large language models (LLMs) and vision-language models (VLMs) have transformed how we process, reason about, and summarize multimodal information. This paper surveys recent methods leveraging LLMs for crash detection from video data. We present a structured taxonomy of fusion strategies, summarize key datasets, analyze model architectures, compare performance benchmarks, and discuss ongoing challenges and opportunities. Our review provides a foundation for future research in this fast-growing intersection of video understanding and foundation models.

