LG · ML · Oct 28, 2021

OMASGAN: Out-of-Distribution Minimum Anomaly Score GAN for Sample Generation on the Boundary

arXiv:2110.15273v2 · 6 citations
Originality: Incremental advance
AI Analysis

This paper addresses the rarity of anomalies in anomaly detection for image data. The advance is incremental because it builds on existing GAN and self-supervised learning methods.

The paper tackles the problem of generative models assigning high likelihood to out-of-distribution samples, which degrades anomaly detection performance. It proposes OMASGAN, which generates anomalous samples on the distribution boundary for negative data augmentation, and reports average AUROC improvements of at least 0.24 points on MNIST and 0.07 points on CIFAR-10.

Generative models trained in an unsupervised manner may assign high likelihood and low reconstruction loss to Out-of-Distribution (OoD) samples. This increases Type II errors and leads to missed anomalies, overall decreasing Anomaly Detection (AD) performance. In addition, AD models underperform due to the rarity of anomalies. To address these limitations, we propose the OoD Minimum Anomaly Score GAN (OMASGAN). OMASGAN generates, in a negative data augmentation manner, anomalous samples on the estimated distribution boundary. These samples are then used to refine an AD model, leading to more accurate estimation of the underlying data distribution, including multimodal supports with disconnected modes. OMASGAN performs retraining by including the abnormal minimum-anomaly-score OoD samples generated on the distribution boundary in a self-supervised learning manner. For AD inference, we devise a discriminator that is trained with negative and positive samples, where the negative samples are generated and the positive samples are either generated or real. OMASGAN addresses the rarity of anomalies by generating strong and adversarial OoD samples on the distribution boundary using only normal class data, effectively addressing mode collapse. A key characteristic of our model is that it uses any f-divergence distribution metric in its variational representation, not requiring invertibility. OMASGAN does not use feature engineering and makes no assumptions about the data distribution. The evaluation of OMASGAN on image data using the leave-one-out methodology shows that it achieves an improvement of at least 0.24 and 0.07 points in AUROC on average on the MNIST and CIFAR-10 datasets, respectively, over other benchmark and state-of-the-art models for AD.
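
As a rough illustration of the evaluation protocol mentioned in the abstract, the minimal sketch below assumes that "leave-one-out" means each class in turn is held out as the anomaly class while the remaining classes serve as normal data, and that the reported figure is the mean AUROC over held-out classes. The helper names (`auroc`, `leave_one_out_auroc`, `nearest_neighbour_scorer`) and the toy 1-D data are hypothetical. The nearest-neighbour scorer is only a stand-in for an anomaly score and is not OMASGAN's discriminator.

```python
from typing import Callable, Dict, Sequence, Tuple

Scorer = Callable[[float], float]


def auroc(scores: Sequence[float], is_anomaly: Sequence[bool]) -> float:
    """AUROC via the pairwise (Mann-Whitney) identity.

    Equals the fraction of (anomaly, normal) pairs in which the anomaly
    receives the higher score, counting ties as one half.
    """
    pos = [s for s, a in zip(scores, is_anomaly) if a]
    neg = [s for s, a in zip(scores, is_anomaly) if not a]
    if not pos or not neg:
        raise ValueError("need at least one anomaly and one normal sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def nearest_neighbour_scorer(normal: Sequence[float]) -> Scorer:
    """Hypothetical stand-in: anomaly score = distance to closest normal sample."""
    def score(x: float) -> float:
        return min(abs(x - n) for n in normal)
    return score


def leave_one_out_auroc(
    samples: Sequence[float],
    labels: Sequence[int],
    fit_scorer: Callable[[Sequence[float]], Scorer],
) -> Tuple[Dict[int, float], float]:
    """Hold out each class as anomalous, fit on the rest, and average AUROC.

    For brevity this toy version scores the same samples it was fitted on;
    a real evaluation would score a separate test split.
    """
    per_class: Dict[int, float] = {}
    for held_out in sorted(set(labels)):
        normal = [x for x, y in zip(samples, labels) if y != held_out]
        score = fit_scorer(normal)
        scores = [score(x) for x in samples]
        flags = [y == held_out for y in labels]
        per_class[held_out] = auroc(scores, flags)
    mean = sum(per_class.values()) / len(per_class)
    return per_class, mean


# Toy data: three well-separated 1-D "classes" (illustrative only).
samples = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2, 10.0, 10.1, 10.2]
labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]
per_class, mean_auroc = leave_one_out_auroc(samples, labels, nearest_neighbour_scorer)
print(per_class, mean_auroc)
```

Because the toy classes are fully separated, every held-out class is ranked above all normal samples. On real image data the per-class AUROC values would differ, and an improvement "in AUROC points" would be the difference between two such means.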

Code Implementations: 1 repo
