CV | Jun 26, 2023

Learning Prompt-Enhanced Context Features for Weakly-Supervised Video Anomaly Detection

Yujiang Pu, Xiaoyu Wu, Lulu Yang, Shengjin Wang

arXiv:2306.14451v2 | 20.1112 citations | h-index: 3 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses the problem of detecting anomalies in videos with weak supervision for applications like surveillance, offering incremental improvements in efficiency and sub-class accuracy.

The paper tackles weakly-supervised video anomaly detection by introducing a framework with a Temporal Context Aggregation module for efficient context modeling and a Prompt-Enhanced Learning module to enhance semantic discriminability, achieving competitive performance with reduced parameters and computational costs on benchmarks like UCF-Crime, XD-Violence, and ShanghaiTech.

Video anomaly detection under weak supervision presents significant challenges, particularly due to the lack of frame-level annotations during training. While prior research has utilized graph convolution networks and self-attention mechanisms alongside multiple instance learning (MIL)-based classification loss to model temporal relations and learn discriminative features, these methods often employ multi-branch architectures to capture local and global dependencies separately, resulting in increased parameters and computational costs. Moreover, the coarse-grained interclass separability provided by the binary constraint of MIL-based loss neglects the fine-grained discriminability within anomalous classes. In response, this paper introduces a weakly supervised anomaly detection framework that focuses on efficient context modeling and enhanced semantic discriminability. We present a Temporal Context Aggregation (TCA) module that captures comprehensive contextual information by reusing the similarity matrix and implementing adaptive fusion. Additionally, we propose a Prompt-Enhanced Learning (PEL) module that integrates semantic priors using knowledge-based prompts to boost the discriminative capacity of context features while ensuring separability between anomaly sub-classes. Extensive experiments validate the effectiveness of our method's components, demonstrating competitive performance with reduced parameters and computational effort on three challenging benchmarks: UCF-Crime, XD-Violence, and ShanghaiTech datasets. Notably, our approach significantly improves the detection accuracy of certain anomaly sub-classes, underscoring its practical value and efficacy. Our code is available at: https://github.com/yujiangpu20/PEL4VAD.
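The idea of reusing a single similarity matrix for both global and local context, then fusing the two adaptively, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that). The function name `temporal_context`, the band-shaped local mask with a `window` parameter, and the scalar sigmoid `gate` are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)


def temporal_context(features, window=2, gate=0.0):
    """Hypothetical sketch: features is (T, D); returns (T, D) context features."""
    t, d = features.shape
    # One similarity matrix, computed once and reused by both branches.
    sim = features @ features.T / np.sqrt(d)
    # Global context: every snippet attends to every other snippet.
    global_ctx = softmax(sim) @ features
    # Local context: the same matrix, masked to a band of +/- window snippets.
    idx = np.arange(t)
    band = np.abs(idx[:, None] - idx[None, :]) <= window
    local_ctx = softmax(np.where(band, sim, -np.inf)) @ features
    # Adaptive fusion: a scalar gate (learnable in a real model) mixes the two.
    w = 1.0 / (1.0 + np.exp(-gate))
    return w * local_ctx + (1.0 - w) * global_ctx
```

Because both branches share `sim`, the local branch adds only a mask and a second softmax rather than a separate set of projections, which is one plausible way a single-branch design could save parameters compared with multi-branch architectures.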

