Predicting Next Local Appearance for Video Anomaly Detection
This work addresses the problem that video anomaly detection methods for surveillance and security applications are computationally expensive and generalize poorly, though the contribution is incremental because it builds on existing adversarial methods.
The paper tackles video anomaly detection by predicting the next local appearance of objects with an adversarial framework, achieving results competitive with the state of the art while being significantly faster in training and inference and generalizing better to unseen scenes.
We present a method for local anomaly detection in videos. Most existing methods are computationally expensive and do not generalize well across video scenes. In contrast, we propose an adversarial framework that learns temporal variations in local appearance by predicting the appearance of a normally behaving object in the next frame of a scene, relying only on its current and past appearances. When an object behaves abnormally, the reconstruction error between its real and predicted next appearance indicates the likelihood of an anomaly. Our method is competitive with the existing state of the art while being significantly faster in both training and inference and generalizing better to unseen video scenes.
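The scoring step can be illustrated with a minimal sketch. The abstract does not state which error measure is used, so mean squared error is an assumption here. The names `anomaly_score` and `is_anomalous`, the flattened 32x32 patch size, and the threshold value are likewise hypothetical. The predicted patch is a random stand-in for the output of the adversarially trained predictor, which is not modeled.

```python
import random


def anomaly_score(actual_patch, predicted_patch):
    """Mean squared error between the real and predicted next appearance.

    Both patches are flat sequences of pixel intensities (an assumed
    representation; the abstract does not specify one).
    """
    if len(actual_patch) != len(predicted_patch):
        raise ValueError("patches must have the same number of pixels")
    if len(actual_patch) == 0:
        raise ValueError("patches must not be empty")
    squared_errors = (
        (a - p) ** 2 for a, p in zip(actual_patch, predicted_patch)
    )
    return sum(squared_errors) / len(actual_patch)


def is_anomalous(score, threshold):
    """Flag an object when its prediction error exceeds a chosen threshold."""
    return score > threshold


# Synthetic stand-ins for real data (assumption: intensities in [0, 1]).
rng = random.Random(0)
num_pixels = 32 * 32  # hypothetical local patch size, flattened

# Stand-in for the predictor's output for one object.
predicted = [rng.random() for _ in range(num_pixels)]

# Normal behaviour: the real next appearance is close to the prediction.
normal_actual = [p + rng.gauss(0.0, 0.01) for p in predicted]

# Abnormal behaviour: the real next appearance is unrelated to the prediction.
abnormal_actual = [rng.random() for _ in range(num_pixels)]

normal_score = anomaly_score(normal_actual, predicted)
abnormal_score = anomaly_score(abnormal_actual, predicted)

THRESHOLD = 0.01  # hypothetical; in practice tuned on validation data
```

With these synthetic inputs, the abnormal patch yields a much larger error than the normal one. In a real system the threshold would be chosen from held-out data rather than fixed by hand.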