A Survey on Video Anomaly Detection via Deep Learning: Human, Vehicle, and Environment
This survey addresses the need for a comprehensive reference for researchers in computer vision, though its contribution is incremental: it consolidates existing work.
This survey tackles the fragmented state of video anomaly detection (VAD) by systematically organizing the literature across supervision levels, adaptive learning methods, and three application categories (human, vehicle, environment). It aims to provide a structured foundation for advancing both the theoretical and practical aspects of VAD systems.
Video Anomaly Detection (VAD) has emerged as a pivotal task in computer vision, with broad relevance across multiple fields. Recent advances in deep learning have driven significant progress in this area, yet the field remains fragmented across domains and learning paradigms. This survey offers a comprehensive perspective on VAD, systematically organizing the literature across supervision levels and adaptive learning methods such as online, active, and continual learning. We examine the state of VAD across three major application categories: human-centric, vehicle-centric, and environment-centric scenarios, each with distinct challenges and design considerations. In doing so, we identify the fundamental contributions and limitations of current methodologies. By consolidating insights from these subfields, we aim to provide the community with a structured foundation for advancing both the theoretical understanding and the real-world applicability of VAD systems. The survey is intended as a reference for researchers, and it also draws attention to the broader set of open challenges in anomaly detection, including both fundamental research questions and practical obstacles to real-world deployment.