CV · Mar 2, 2025

Transformer Based Self-Context Aware Prediction for Few-Shot Anomaly Detection in Videos

arXiv:2503.00670v1 · 12 citations · h-index: 5 · ICIP
Originality: Incremental advance
AI Analysis

This paper addresses the problem of detecting diverse anomalies in videos from limited data. The contribution is incremental, as it builds on existing transformer and few-shot learning approaches.

The paper tackles few-shot anomaly detection in videos by proposing a transformer-based method that learns from a few non-anomalous frames to predict subsequent frames and detect anomalies, demonstrating effectiveness with qualitative and quantitative results on standard datasets.

Anomaly detection in videos is a challenging task because anomalies in different videos are of different kinds. Therefore, a promising way to approach video anomaly detection is to learn the non-anomalous nature of the video at hand. To this end, we propose a one-class, few-shot-learning-driven, transformer-based approach for anomaly detection in videos that is self-context aware. Features from the first few consecutive non-anomalous frames in a video are used to train the transformer to predict the non-anomalous feature of the subsequent frame. This takes place under the attention of a self-context learned from the input features themselves. After learning, given a few previous frames, the video-specific transformer is used to infer whether a frame is anomalous by comparing the feature it predicts with the actual feature. The effectiveness of the proposed method with respect to the state of the art is demonstrated through qualitative and quantitative results on different standard datasets. We also study the positive effect of the self-context used in our approach.
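
The inference step described in the abstract (predict the next frame's feature from a few previous frames, then compare it with the actual feature) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The names `mean_predictor`, `anomaly_scores`, and `flag_anomalies` are hypothetical, the Euclidean distance and the fixed threshold are assumptions, and the averaging predictor is only a stand-in for the video-specific transformer with self-context attention.

```python
import math
from typing import Callable, List, Sequence

Feature = List[float]


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors (assumed error measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_predictor(context: Sequence[Feature]) -> Feature:
    """Hypothetical stand-in for the trained transformer.

    Predicts the next feature as the element-wise mean of the context
    features. The paper's model is a transformer, not an average.
    """
    n = len(context)
    return [sum(column) / n for column in zip(*context)]


def anomaly_scores(
    features: Sequence[Feature],
    predictor: Callable[[Sequence[Feature]], Feature],
    context_len: int,
) -> List[float]:
    """Score each frame after the first `context_len` frames.

    The score is the distance between the feature predicted from the
    previous `context_len` frames and the frame's actual feature.
    """
    scores = []
    for t in range(context_len, len(features)):
        context = features[t - context_len:t]
        predicted = predictor(context)
        scores.append(l2_distance(predicted, features[t]))
    return scores


def flag_anomalies(scores: Sequence[float], threshold: float) -> List[bool]:
    """Mark a frame as anomalous when its score exceeds the threshold."""
    return [score > threshold for score in scores]
```

For example, calling `anomaly_scores(features, mean_predictor, 3)` on five identical feature vectors followed by one that differs sharply gives a near-zero score for the early frames and a large score for the last one, which `flag_anomalies` then marks as anomalous for a suitably chosen threshold.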

