Synthetic Pseudo Anomalies for Unsupervised Video Anomaly Detection: A Simple yet Efficient Framework based on Masked Autoencoder
This work addresses the scarcity of anomalous training samples in video anomaly detection. It is an incremental improvement over existing autoencoder-based methods.
The paper targets a known weakness of reconstruction-based video anomaly detection: autoencoders often reconstruct anomalies well even when trained only on normal data, which leads to poor detection performance. The authors propose a framework that synthesizes pseudo anomalies from normal data using random mask tokens, together with a normalcy consistency training strategy, and report that this yields better anomaly discrimination.
Due to the limited availability of anomalous samples for training, video anomaly detection is commonly viewed as a one-class classification problem. Many prevalent methods investigate the reconstruction difference produced by AutoEncoders (AEs) under the assumption that the AEs would reconstruct normal data well while reconstructing anomalies poorly. However, even when trained only on normal data, the AEs often reconstruct anomalies well, which degrades their anomaly detection performance. To alleviate this issue, we propose a simple yet efficient framework for video anomaly detection. We introduce pseudo anomaly samples, which are synthesized from only normal data by embedding random mask tokens, without extra data processing. We also propose a normalcy consistency training strategy that encourages the AEs to better learn regular patterns from normal data and the corresponding pseudo anomaly data. In this way, the AEs learn more distinct reconstruction boundaries between normal and abnormal data, resulting in superior anomaly discrimination capability. Experimental results demonstrate the effectiveness of the proposed method.
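The two ideas in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives neither the masking scheme nor the loss, so the function names `make_pseudo_anomaly` and `consistency_loss`, the `mask_ratio` parameter, the all-zero mask token, and the specific objective (both the normal input and its pseudo anomaly should reconstruct to the normal target) are all assumptions made for illustration. Tokens are plain lists of floats standing in for patch embeddings.

```python
import random


def make_pseudo_anomaly(tokens, mask_token, mask_ratio, rng):
    """Replace a random subset of tokens with a mask token (hypothetical scheme).

    Returns the corrupted copy and the sorted indices that were replaced.
    """
    n = len(tokens)
    k = round(mask_ratio * n)
    masked = set(rng.sample(range(n), k))
    pseudo = [list(mask_token) if i in masked else list(t)
              for i, t in enumerate(tokens)]
    return pseudo, sorted(masked)


def mse(a, b):
    """Mean squared error between two equally shaped token sequences."""
    total, count = 0.0, 0
    for ta, tb in zip(a, b):
        for x, y in zip(ta, tb):
            total += (x - y) ** 2
            count += 1
    return total / count


def consistency_loss(reconstruct, tokens, pseudo):
    """Assumed objective: the normal input and its pseudo anomaly should
    both be reconstructed as the normal target."""
    return mse(reconstruct(tokens), tokens) + mse(reconstruct(pseudo), tokens)


# Toy usage: an identity function stands in for the autoencoder.
rng = random.Random(0)
tokens = [[float(i), float(i) + 0.5] for i in range(8)]
mask_token = [0.0, 0.0]  # assumed; a learned token would be used in practice
pseudo, idx = make_pseudo_anomaly(tokens, mask_token, 0.25, rng)
identity = lambda seq: [list(t) for t in seq]
print(idx, consistency_loss(identity, tokens, pseudo))
```

With an identity stand-in, the loss on the pseudo anomaly is nonzero, because the masked positions are copied through unchanged. Under the assumed objective, a trained model would be pushed to restore those positions to their normal content, which is one way to read "more distinct reconstruction boundaries".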