CV · Mar 2, 2024

Learn Suspected Anomalies from Event Prompts for Video Anomaly Detection

arXiv:2403.01169v2 · 7 citations · h-index: 8 · Has Code
AI Analysis

This work addresses ambiguous anomaly definitions in video surveillance, offering researchers and practitioners an approach intended to improve detection accuracy and generalizability.

The paper tackles weakly supervised video anomaly detection by using textual event prompts to guide learning of suspected anomalies, outperforming most state-of-the-art methods with AP or AUC scores of 86.5%, 90.4%, 94.4%, and 97.4% on four datasets.

Most models for weakly supervised video anomaly detection (WS-VAD) rely on multiple instance learning, aiming to distinguish normal and abnormal snippets without specifying the type of anomaly. However, the ambiguous nature of anomaly definitions across contexts may introduce inaccuracy in discriminating abnormal and normal events. To show the model what is anomalous, a novel framework is proposed to guide the learning of suspected anomalies from event prompts. Given a textual prompt dictionary of potential anomaly events and the captions generated from anomaly videos, the semantic anomaly similarity between them could be calculated to identify the suspected events for each video snippet. It enables a new multi-prompt learning process to constrain the visual-semantic features across all videos, as well as provides a new way to label pseudo anomalies for self-training. To demonstrate its effectiveness, comprehensive experiments and detailed ablation studies are conducted on four datasets, namely XD-Violence, UCF-Crime, TAD, and ShanghaiTech. Our proposed model outperforms most state-of-the-art methods in terms of AP or AUC (86.5%, 90.4%, 94.4%, and 97.4%). Furthermore, it shows promising performance in open-set and cross-dataset cases. The data, code, and models can be found at: https://github.com/shiwoaz/lap.
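
The idea of scoring snippet captions against a prompt dictionary can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words `embed` function stands in for whatever text encoder the paper uses, and the function names, example prompts, example captions, and the 0.3 threshold are all assumptions made up for illustration.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words embedding (hypothetical stand-in for a real text encoder)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)  # Counter yields 0 for missing words
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def anomaly_similarity(captions, prompts):
    """For each snippet caption, return (best similarity, best-matching prompt)."""
    prompt_vecs = [embed(p) for p in prompts]
    results = []
    for caption in captions:
        cap_vec = embed(caption)
        sims = [cosine(cap_vec, p) for p in prompt_vecs]
        best = max(range(len(prompts)), key=sims.__getitem__)
        results.append((sims[best], prompts[best]))
    return results


def pseudo_labels(results, threshold=0.3):
    """Mark a snippet as a suspected anomaly when its best similarity
    reaches the (assumed, illustrative) threshold."""
    return [1 if score >= threshold else 0 for score, _ in results]


# Hypothetical prompt dictionary and per-snippet captions.
prompts = ["people fighting", "car crash", "explosion"]
captions = [
    "two people fighting in the street",
    "a man walking a dog",
    "a car crash on the highway",
]

results = anomaly_similarity(captions, prompts)
labels = pseudo_labels(results)
print(labels)  # [1, 0, 1]
```

In this toy setup the first and third captions share words with a prompt and are flagged as suspected anomalies, while the second is not. A real system would presumably replace `embed` with a learned text encoder and tune the threshold, or use the similarity scores directly as soft pseudo labels.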

Code Implementations: 1 repo