CV | AI | Aug 1, 2025

GV-VAD: Exploring Video Generation for Weakly-Supervised Video Anomaly Detection

arXiv:2508.00312v1 | 2 citations | h-index: 7 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses the problem of scaling video anomaly detection for public safety applications. It is an incremental advance: it builds on existing weakly-supervised methods and adds a new data augmentation approach.

The paper tackles the scarcity and high cost of real-world anomaly data in video anomaly detection. It proposes a framework that uses text-conditioned video generation to create synthetic videos for data augmentation, and it reports improved performance over state-of-the-art methods on the UCF-Crime dataset.

Video anomaly detection (VAD) plays a critical role in public safety applications such as intelligent surveillance. However, the rarity, unpredictability, and high annotation cost of real-world anomalies make it difficult to scale VAD datasets, which limits the performance and generalization ability of existing models. To address this challenge, we propose a generative video-enhanced weakly-supervised video anomaly detection (GV-VAD) framework that leverages text-conditioned video generation models to produce semantically controllable and physically plausible synthetic videos. These synthetic videos are used to augment the training data at low cost. In addition, a synthetic sample loss scaling strategy controls the influence of the generated samples for efficient training. Experiments show that the proposed framework outperforms state-of-the-art methods on the UCF-Crime dataset. The code is available at https://github.com/Sumutan/GV-VAD.git.
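The abstract names a synthetic sample loss scaling strategy but does not give its formula. The sketch below illustrates one common way such a strategy could look, not the authors' implementation: a video-level binary cross-entropy loss in which losses from generated videos are multiplied by a scaling factor. The function name `scaled_bce_loss`, the parameter `synthetic_weight`, and the choice of binary cross-entropy are all assumptions made for illustration.

```python
import math


def scaled_bce_loss(scores, labels, is_synthetic, synthetic_weight=0.5, eps=1e-7):
    """Mean binary cross-entropy with down-weighted synthetic samples.

    scores:           predicted video-level anomaly probabilities in [0, 1]
    labels:           weak video-level labels (1 = anomalous, 0 = normal)
    is_synthetic:     True for generated videos, False for real ones
    synthetic_weight: hypothetical scaling factor applied to synthetic losses
    """
    if not (len(scores) == len(labels) == len(is_synthetic)):
        raise ValueError("scores, labels and is_synthetic must have equal length")
    if not scores:
        raise ValueError("at least one sample is required")

    total = 0.0
    for p, y, synth in zip(scores, labels, is_synthetic):
        # Clamp the probability so log() never receives 0.
        p = min(max(p, eps), 1.0 - eps)
        bce = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        # Real samples keep weight 1; synthetic samples are scaled (assumption).
        weight = synthetic_weight if synth else 1.0
        total += weight * bce

    # Normalise by batch size so the scale factor only shrinks the
    # synthetic contribution and leaves the real-sample terms unchanged.
    return total / len(scores)


# Two real videos and one generated video.
scores = [0.9, 0.2, 0.6]
labels = [1, 0, 1]
is_synthetic = [False, False, True]
loss = scaled_bce_loss(scores, labels, is_synthetic, synthetic_weight=0.5)
```

With `synthetic_weight=1.0` this reduces to the plain mean binary cross-entropy, and with `synthetic_weight=0.0` the generated videos contribute nothing to the loss. Values in between trade off the extra coverage from synthetic data against the risk that generation artifacts dominate training.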

