CV · Mar 19, 2022

No Shifted Augmentations (NSA): compact distributions for robust self-supervised Anomaly Detection

Mohamed Yousef, Marcel Ackermann, Unmesh Kurup, Tom Bishop

arXiv:2203.10344v15.74 citations · h-index: 13

Originality: Incremental advance

AI Analysis

This work addresses robust anomaly detection for applications such as medical imaging or security, where training data may be contaminated. The advance is incremental, as it builds on existing self-supervised methods.

The paper tackles unsupervised anomaly detection in natural images. It proposes architectural modifications to self-supervised feature learning that yield compact in-distribution feature distributions. This improves outlier detection, especially when the training data is polluted with out-of-distribution samples, and gives state-of-the-art performance on benchmark datasets under pollution conditions.

Unsupervised anomaly detection (AD) requires building a notion of normalcy, distinguishing in-distribution (ID) and out-of-distribution (OOD) data, using only available ID samples. Recently, large gains were made on this task for the domain of natural images using self-supervised contrastive feature learning as a first step, followed by kNN or traditional one-class classifiers for feature scoring. Learned representations that are non-uniformly distributed on the unit hypersphere have been shown to be beneficial for this task. We go a step further and investigate how the geometrical compactness of the ID feature distribution makes isolating and detecting outliers easier, especially in the realistic situation when ID training data is polluted (i.e., ID data contains some OOD data that is used for learning the feature extractor parameters). We propose novel architectural modifications to the self-supervised feature learning step that enable such compact distributions for ID data to be learned. We show that the proposed modifications can be effectively applied to most existing self-supervised objectives, with large gains in performance. Furthermore, this improved OOD performance is obtained without resorting to tricks such as using strongly augmented ID images (e.g., by 90-degree rotations) as proxies for the unseen OOD data, as these impose overly prescriptive assumptions about ID data and its invariances. We perform extensive studies on benchmark datasets for one-class OOD detection and show state-of-the-art performance in the presence of pollution in the ID data, and comparable performance otherwise. We also propose and extensively evaluate a novel feature scoring technique based on the angular Mahalanobis distance, and propose a simple and novel technique for feature ensembling during evaluation that enables a big boost in performance at nearly zero run-time cost.
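
The feature-scoring stage mentioned in the abstract (kNN or a one-class scorer applied to features on the unit hypersphere) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `l2_normalize`, `knn_cosine_score`, and `mahalanobis_score` are hypothetical, the synthetic Gaussian features stand in for the output of a learned encoder, and the second scorer is the ordinary Mahalanobis distance on normalized features rather than the angular variant proposed in the paper, whose definition is not given here.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Project feature vectors onto the unit hypersphere."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)

def knn_cosine_score(train_feats, test_feats, k=5):
    """Anomaly score: 1 minus the mean cosine similarity to the k nearest ID features."""
    train = l2_normalize(train_feats)
    test = l2_normalize(test_feats)
    sims = test @ train.T                    # (n_test, n_train) cosine similarities
    topk = np.sort(sims, axis=1)[:, -k:]     # k largest similarities per test point
    return 1.0 - topk.mean(axis=1)

def mahalanobis_score(train_feats, test_feats, ridge=1e-3):
    """Standard (not angular) squared Mahalanobis distance to the ID feature mean."""
    train = l2_normalize(train_feats)
    test = l2_normalize(test_feats)
    mean = train.mean(axis=0)
    # The ridge term keeps the covariance invertible; its value is an assumption.
    cov = np.cov(train, rowvar=False) + ridge * np.eye(train.shape[1])
    precision = np.linalg.inv(cov)
    diff = test - mean
    return np.einsum("ij,jk,ik->i", diff, precision, diff)

# Synthetic stand-in for encoder outputs: a compact ID cluster and a distant OOD cluster.
rng = np.random.default_rng(0)
dim = 16
id_center = np.zeros(dim)
id_center[0] = 1.0
ood_center = np.zeros(dim)
ood_center[1] = 1.0
id_train = id_center + 0.05 * rng.standard_normal((200, dim))
id_test = id_center + 0.05 * rng.standard_normal((20, dim))
ood_test = ood_center + 0.05 * rng.standard_normal((20, dim))

knn_id = knn_cosine_score(id_train, id_test)
knn_ood = knn_cosine_score(id_train, ood_test)
maha_id = mahalanobis_score(id_train, id_test)
maha_ood = mahalanobis_score(id_train, ood_test)
```

In this toy setting the ID cluster is compact, so both scorers assign clearly higher scores to the OOD samples than to held-out ID samples. That is the intuition behind the abstract's claim that compactness makes outliers easier to isolate, although the sketch says nothing about how the paper's architectural changes achieve such compactness.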

