CV, Jun 5, 2024

Distilling Aggregated Knowledge for Weakly-Supervised Video Anomaly Detection

Jash Dalvi, Ali Dabouei, Gunjan Dhanuka, Min Xu

arXiv:2406.02831v2, 3.71 citations

Originality: Incremental advance

AI Analysis

The paper addresses anomaly detection in surveillance videos for security applications. The advance is incremental, since it builds on existing distillation and aggregation methods.

The paper tackles video anomaly detection under weak supervision by distilling knowledge from aggregated multi-backbone representations into a single-backbone model, achieving state-of-the-art performance with improvements of 1.36%, 0.78%, and 7.02% on the UCF-Crime, ShanghaiTech, and XD-Violence datasets, respectively.

Video anomaly detection aims to develop automated models capable of identifying abnormal events in surveillance videos. The benchmark setup for this task is extremely challenging due to: i) the limited size of the training sets, ii) weak supervision provided in terms of video-level labels, and iii) intrinsic class imbalance induced by the scarcity of abnormal events. In this work, we show that distilling knowledge from aggregated representations of multiple backbones into a single-backbone Student model achieves state-of-the-art performance. In particular, we develop a bi-level distillation approach along with a novel disentangled cross-attention-based feature aggregation network. Our proposed approach, DAKD (Distilling Aggregated Knowledge with Disentangled Attention), demonstrates superior performance compared to existing methods across multiple benchmark datasets. Notably, we achieve significant improvements of 1.36%, 0.78%, and 7.02% on the UCF-Crime, ShanghaiTech, and XD-Violence datasets, respectively.
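
To make the bi-level distillation idea concrete, the following is a minimal sketch of one plausible form such a loss could take. It is not the authors' implementation: the abstract does not state the loss functions, so the feature-level MSE term, the score-level binary cross-entropy term, the weighting parameter `alpha`, and all function names are assumptions made for illustration only.

```python
import math


def mse(a, b):
    """Mean squared error between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def bce(p, q, eps=1e-7):
    """Binary cross-entropy of a student score p against a soft target q."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(q * math.log(p) + (1.0 - q) * math.log(1.0 - p))


def bilevel_distill_loss(student_feats, teacher_feats,
                         student_scores, teacher_scores, alpha=0.5):
    """Hypothetical two-level distillation loss over the snippets of one video.

    student_feats / teacher_feats: one feature vector per snippet. The teacher
        vectors stand in for the aggregated multi-backbone representation.
    student_scores / teacher_scores: one anomaly score in [0, 1] per snippet.
    alpha: assumed weight balancing the feature-level and score-level terms.
    """
    n = len(student_feats)
    # Level 1 (assumed): match student features to the aggregated teacher features.
    feat_loss = sum(mse(s, t) for s, t in zip(student_feats, teacher_feats)) / n
    # Level 2 (assumed): match student anomaly scores to the teacher's soft scores.
    score_loss = sum(bce(s, t) for s, t in zip(student_scores, teacher_scores)) / n
    return alpha * feat_loss + (1.0 - alpha) * score_loss


# Toy usage: two snippets with 2-dimensional features.
teacher_feats = [[0.1, 0.2], [0.3, 0.4]]
teacher_scores = [0.9, 0.1]
loss = bilevel_distill_loss([[0.0, 0.2], [0.3, 0.5]], teacher_feats,
                            [0.8, 0.2], teacher_scores)
```

In this sketch the student is penalized both for drifting from the teacher's intermediate representation and for disagreeing with its per-snippet anomaly scores. In practice such terms would be combined with a weakly-supervised objective on the video-level labels.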
