CV | AI | Mar 18, 2024

Graph-Jigsaw Conditioned Diffusion Model for Skeleton-based Video Anomaly Detection

arXiv:2403.12172v2 | 18 citations | h-index: 5 | WACV
Originality: Incremental advance
AI Analysis

This work addresses the problem of detecting suspicious activities in videos for safety applications, but it appears incremental because it builds on existing diffusion and graph methods.

The paper tackles skeleton-based video anomaly detection by introducing GiCiSAD, a framework that uses graph attention, jigsaw puzzles, and conditional diffusion to capture spatio-temporal dependencies and region-level discrepancies, achieving state-of-the-art results on four datasets with fewer parameters.

Skeleton-based video anomaly detection (SVAD) is a crucial task in computer vision. Accurately identifying abnormal patterns or events enables operators to promptly detect suspicious activities, thereby enhancing safety. Achieving this demands a comprehensive understanding of human motions, both at body and region levels, while also accounting for the wide variations of performing a single action. However, existing studies fail to simultaneously address these crucial properties. This paper introduces a novel, practical and lightweight framework, namely Graph-Jigsaw Conditioned Diffusion Model for Skeleton-based Video Anomaly Detection (GiCiSAD) to overcome the challenges associated with SVAD. GiCiSAD consists of three novel modules: the Graph Attention-based Forecasting module to capture the spatio-temporal dependencies inherent in the data, the Graph-level Jigsaw Puzzle Maker module to distinguish subtle region-level discrepancies between normal and abnormal motions, and the Graph-based Conditional Diffusion model to generate a wide spectrum of human motions. Extensive experiments on four widely used skeleton-based video datasets show that GiCiSAD outperforms existing methods with significantly fewer training parameters, establishing it as the new state-of-the-art.
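To make the jigsaw idea concrete, here is a minimal sketch of how a graph-level puzzle over skeleton joints could be built. It is an illustration under stated assumptions, not the authors' implementation: the function name `make_graph_jigsaw`, the 17-joint layout, the `left_arm` indices, and the choice to shuffle whole joint trajectories within one body region are all hypothetical. The abstract only states that the module helps distinguish region-level discrepancies.

```python
import random


def make_graph_jigsaw(seq, region, rng):
    """Shuffle the joints listed in `region` among themselves.

    seq    : list of frames, each frame a list of (x, y) joint coordinates
    region : joint indices forming one body region (hypothetical grouping)
    rng    : random.Random instance, passed in for reproducibility
    Returns the shuffled copy and the permutation that was applied.
    """
    perm = list(range(len(region)))
    rng.shuffle(perm)
    # Copy each frame's joint list so the input sequence is left untouched.
    shuffled = [list(frame) for frame in seq]
    for frame_in, frame_out in zip(seq, shuffled):
        for i, src in enumerate(perm):
            # Joint region[i] now carries the coordinates of joint region[src].
            frame_out[region[i]] = frame_in[region[src]]
    return shuffled, perm


rng = random.Random(0)
# 8 frames, 17 joints, 2D coordinates; the sizes are illustrative only.
seq = [[(rng.random(), rng.random()) for _ in range(17)] for _ in range(8)]
left_arm = [5, 7, 9]  # hypothetical joint indices for one region
shuffled, perm = make_graph_jigsaw(seq, left_arm, rng)
```

In a self-supervised setup of this kind, a model would typically be trained to recover `perm` from `shuffled`, which pushes it to learn how joints inside a region relate to each other. Whether GiCiSAD uses exactly this objective is not stated in the abstract.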
