CV · Mar 2, 2025

Transformer Based Self-Context Aware Prediction for Few-Shot Anomaly Detection in Videos

Gargi V. Pillai, Ashish Verma, Debashis Sen

arXiv:2503.00670v13.612 citations · h-index: 5 · ICIP

Originality: Incremental advance

AI Analysis

The paper addresses the problem of detecting diverse anomalies in videos from limited data, but the advance is incremental because it builds on existing transformer and few-shot learning approaches.

The paper tackles few-shot anomaly detection in videos by proposing a transformer-based method that learns from a few non-anomalous frames to predict subsequent frames and detect anomalies, demonstrating effectiveness with qualitative and quantitative results on standard datasets.

Anomaly detection in videos is a challenging task because the anomalies in different videos are of different kinds. Therefore, a promising way to approach video anomaly detection is to learn the non-anomalous nature of the video at hand. To this end, we propose a self-context-aware, transformer-based approach for anomaly detection in videos that is driven by one-class few-shot learning. Features from the first few consecutive non-anomalous frames in a video are used to train the transformer to predict the non-anomalous feature of the subsequent frame. This takes place under the attention of a self-context learned from the input features themselves. After the learning, given a few previous frames, the video-specific transformer is used to infer whether a frame is anomalous by comparing the feature it predicts with the actual feature. The effectiveness of the proposed method with respect to the state of the art is demonstrated through qualitative and quantitative results on different standard datasets. We also study the positive effect of the self-context used in our approach.
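
The inference step described in the abstract (predict the next frame's feature from a few previous frames, then compare it with the actual feature) can be illustrated with a minimal sketch. This is not the authors' model: the predictor below is an untrained dot-product attention pooling used as a stand-in for the trained video-specific transformer, and the thresholding rule, along with the names `window`, `n_normal`, and `margin`, are assumptions made only for illustration.

```python
import math


def attention_predict(context):
    """Stand-in predictor (assumption): dot-product attention pooling.

    The last feature in the context acts as the query, and the prediction
    is the softmax-weighted average of all context features. The paper
    trains a transformer for this step instead.
    """
    query = context[-1]
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in context]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [
        sum(w * feat[d] for w, feat in zip(weights, context)) / total
        for d in range(len(query))
    ]


def prediction_error(predicted, actual):
    """Euclidean distance between the predicted and the actual feature."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)))


def detect_anomalies(features, window, n_normal, margin=3.0):
    """Return indices of frames whose prediction error exceeds a threshold.

    features: list of per-frame feature vectors.
    window:   number of previous frames used as context.
    n_normal: number of leading frames assumed to be non-anomalous.
    margin:   hypothetical multiplier on the largest error observed on the
              leading non-anomalous frames (not taken from the paper).
    """
    if n_normal <= window:
        raise ValueError("n_normal must exceed window to calibrate a threshold")
    errors = {
        t: prediction_error(attention_predict(features[t - window:t]), features[t])
        for t in range(window, len(features))
    }
    # Calibrate on the frames that are assumed to be non-anomalous.
    threshold = margin * max(errors[t] for t in range(window, n_normal))
    return [t for t, err in errors.items() if err > threshold]


if __name__ == "__main__":
    # Toy features: eight slightly varying "normal" frames, then an outlier.
    normal = [[1.0, 0.0] if i % 2 == 0 else [1.0, 0.1] for i in range(8)]
    frames = normal + [[0.0, 5.0]]
    print(detect_anomalies(frames, window=3, n_normal=6))
```

In this sketch an anomalous frame that enters the context window also distorts the predictions for the frames that follow it, so a practical system would need to decide how to handle a contaminated context.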
