LG ML Oct 28, 2021

OMASGAN: Out-of-Distribution Minimum Anomaly Score GAN for Sample Generation on the Boundary

Nikolaos Dionelis, Mehrdad Yaghoobi, Sotirios A. Tsaftaris

arXiv:2110.15273v24.46 citations h-index: 46 Has Code

Originality: Incremental advance

AI Analysis

The paper addresses the rarity of anomalies in anomaly detection for image data. The advance is incremental because it builds on existing GAN and self-supervised learning methods.

The paper tackles the problem of generative models assigning high likelihood to out-of-distribution samples, which degrades anomaly detection performance. It proposes OMASGAN, which generates anomalous samples on the distribution boundary for negative data augmentation, and reports average AUROC improvements of at least 0.24 on MNIST and 0.07 on CIFAR-10 over other benchmark and state-of-the-art models.
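For context on the AUROC figures quoted above, the following is a minimal sketch of how AUROC can be computed from anomaly scores. It is not taken from the OMASGAN code: the function name `auroc` and the toy scores are hypothetical, and it assumes that a higher score means "more anomalous".

```python
def auroc(scores_normal, scores_anomalous):
    """Fraction of (anomalous, normal) pairs in which the anomalous sample
    receives the higher anomaly score; ties count as half a correct pair."""
    correct = 0.0
    for a in scores_anomalous:
        for n in scores_normal:
            if a > n:
                correct += 1.0
            elif a == n:
                correct += 0.5
    return correct / (len(scores_anomalous) * len(scores_normal))


# Toy, made-up anomaly scores (illustrative only, not results from the paper).
normal = [0.1, 0.2, 0.35, 0.4]
anomalous = [0.3, 0.8, 0.9]

# 10 of the 12 (anomalous, normal) pairs are ordered correctly.
print(auroc(normal, anomalous))
```

An AUROC of 0.5 corresponds to scores that do not separate the two groups at all, and 1.0 to perfect separation, so a gain of 0.07 or 0.24 is a change on that 0-to-1 scale.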

Generative models trained in an unsupervised manner may set high likelihood and low reconstruction loss to Out-of-Distribution (OoD) samples. This increases Type II errors and leads to missed anomalies, overall decreasing Anomaly Detection (AD) performance. In addition, AD models underperform due to the rarity of anomalies. To address these limitations, we propose the OoD Minimum Anomaly Score GAN (OMASGAN). OMASGAN generates, in a negative data augmentation manner, anomalous samples on the estimated distribution boundary. These samples are then used to refine an AD model, leading to more accurate estimation of the underlying data distribution including multimodal supports with disconnected modes. OMASGAN performs retraining by including the abnormal minimum-anomaly-score OoD samples generated on the distribution boundary in a self-supervised learning manner. For inference, for AD, we devise a discriminator which is trained with negative and positive samples either generated (negative or positive) or real (only positive). OMASGAN addresses the rarity of anomalies by generating strong and adversarial OoD samples on the distribution boundary using only normal class data, effectively addressing mode collapse. A key characteristic of our model is that it uses any f-divergence distribution metric in its variational representation, not requiring invertibility. OMASGAN does not use feature engineering and makes no assumptions about the data distribution. The evaluation of OMASGAN on image data using the leave-one-out methodology shows that it achieves an improvement of at least 0.24 and 0.07 points in AUROC on average on the MNIST and CIFAR-10 datasets, respectively, over other benchmark and state-of-the-art models for AD.
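The abstract refers to the variational representation of an f-divergence without stating it. As a reminder, the standard form (as used in f-GAN-style training) is sketched below. The symbols $P$, $Q$, $T$, and $\mathcal{T}$ are notation chosen here for illustration, and the exact objective OMASGAN optimizes is not given in this listing.

```latex
% f-divergence for a convex function f with f(1) = 0
D_f(P \,\|\, Q) = \int q(x)\, f\!\left(\frac{p(x)}{q(x)}\right) dx

% Variational lower bound over a class \mathcal{T} of critic functions T,
% where f^* is the convex (Fenchel) conjugate of f
D_f(P \,\|\, Q) \;\ge\; \sup_{T \in \mathcal{T}}
  \Big( \mathbb{E}_{x \sim P}\big[T(x)\big]
      - \mathbb{E}_{x \sim Q}\big[f^*(T(x))\big] \Big),
\qquad
f^*(t) = \sup_{u} \big( u\,t - f(u) \big)

% Example: KL divergence
f(u) = u \log u, \qquad f^*(t) = e^{\,t-1}
```

Because the bound involves only expectations under $P$ and $Q$, it can be estimated from samples alone. This is consistent with the abstract's remark that invertibility is not required.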
