A Comparison of Supervised and Unsupervised Deep Learning Methods for Anomaly Detection in Images
This work addresses the problem of automating anomaly detection in images for applications like quality assurance, but it is incremental as it compares existing methods on a standard dataset.
The paper compares supervised and unsupervised deep learning methods for anomaly detection in images on the MVTec dataset. The unsupervised KD-CAE outperformed the CNN and NI-CAE in most cases, while NI-CAE achieved the best results on the Transistor dataset.
Anomaly detection in images plays a significant role in many applications across industries, such as disease diagnosis in healthcare or quality assurance in manufacturing. Manual inspection of images, when carried out over long, monotonously repetitive periods, is very time consuming and can lead to anomalies being overlooked. Artificial neural networks have proven very successful on simple, repetitive tasks, in some cases even outperforming humans. Therefore, in this paper we investigate different deep learning methods, including supervised and unsupervised learning, for anomaly detection applied to a quality assurance use case. We utilize the MVTec anomaly dataset and develop three different detection models: a CNN for supervised anomaly detection, KD-CAE for autoencoder-based anomaly detection, and NI-CAE for noise-induced anomaly detection, as well as a DCGAN for generating reconstructed images. In our experiments, we found that KD-CAE performs better on the anomaly datasets than CNN and NI-CAE, with NI-CAE performing best on the Transistor dataset. We also implemented a DCGAN for the creation of new training data, but due to computational limitations and difficulty in extrapolating the mechanics of AnoGAN, we restricted ourselves to the generation of GAN-based images. We conclude that unsupervised methods are more powerful for anomaly detection in images, especially in a setting where only a small amount of anomalous data is available or the data is unlabeled.
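As an illustration of the general idea behind autoencoder-based detection, the following minimal sketch scores each image by its reconstruction error and flags images whose error exceeds a threshold fitted on defect-free data. It is not the KD-CAE or NI-CAE implementation from this paper. The function names, the 99th-percentile threshold, the flattened-list image representation, and the toy_reconstruct stand-in for a trained autoencoder are all assumptions made for illustration.

```python
from statistics import fmean, quantiles


def reconstruction_error(image, reconstruction):
    """Mean squared error between two equally sized, flattened images."""
    if len(image) != len(reconstruction):
        raise ValueError("image and reconstruction must have the same size")
    return fmean((p - q) ** 2 for p, q in zip(image, reconstruction))


def fit_threshold(normal_errors, percentile=99):
    """Choose the threshold as a percentile (1..99) of errors on defect-free images."""
    # quantiles(n=100) returns the 99 cut points for percentiles 1..99.
    return quantiles(normal_errors, n=100, method="inclusive")[percentile - 1]


def flag_anomalies(images, reconstruct, threshold):
    """Return one boolean per image: True where the error exceeds the threshold."""
    return [reconstruction_error(img, reconstruct(img)) > threshold for img in images]


def toy_reconstruct(image):
    """Hypothetical stand-in for a trained autoencoder (assumption).

    It outputs a flat image at the mean intensity, so near-uniform images
    are reconstructed well and images with a bright patch are not.
    """
    mean = fmean(image)
    return [mean] * len(image)


# Toy data: 20 near-uniform "good" images and one image with a bright patch.
normal = [[0.5 + 0.01 * ((i * 7 + j * 3) % 5 - 2) for j in range(16)] for i in range(20)]
defect = [0.5] * 12 + [1.0] * 4

normal_errors = [reconstruction_error(img, toy_reconstruct(img)) for img in normal]
threshold = fit_threshold(normal_errors)
print(flag_anomalies([defect], toy_reconstruct, threshold))
```

In practice, reconstruct would wrap a trained convolutional autoencoder, and the threshold would be tuned on held-out defect-free images. The percentile used here is only one possible choice.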