LG · Jun 11, 2021

Self-supervise, Refine, Repeat: Improving Unsupervised Anomaly Detection

Jinsung Yoon, Kihyuk Sohn, Chun-Liang Li, Sercan O. Arik, Chen-Yu Lee, Tomas Pfister

arXiv:2106.06115v213.131 citations

Originality: Incremental advance

AI Analysis

The paper addresses anomaly detection without labels, with applications in domains such as security and healthcare, and represents an incremental improvement over existing methods.

The paper tackles fully unsupervised anomaly detection by making one-class classification more robust through a data refinement process applied to self-supervised representations. Reported gains over the state-of-the-art one-class classifier include 6.3 AUC on CIFAR-10 (10% anomaly ratio) and 22.9 F1-score on Thyroid (2.5% anomaly ratio).

Anomaly detection (AD), separating anomalies from normal data, has many applications across domains, from security to healthcare. While most previous works were shown to be effective for cases with fully or partially labeled data, that setting is in practice less common due to labeling being particularly tedious for this task. In this paper, we focus on fully unsupervised AD, in which the entire training dataset, containing both normal and anomalous samples, is unlabeled. To tackle this problem effectively, we propose to improve the robustness of one-class classification trained on self-supervised representations using a data refinement process. Our proposed data refinement approach is based on an ensemble of one-class classifiers (OCCs), each of which is trained on a disjoint subset of training data. Representations learned by self-supervised learning on the refined data are iteratively updated as the data refinement improves. We demonstrate our method on various unsupervised AD tasks with image and tabular data. With a 10% anomaly ratio on CIFAR-10 image data / 2.5% anomaly ratio on Thyroid tabular data, the proposed method outperforms the state-of-the-art one-class classifier by 6.3 AUC and 12.5 average precision / 22.9 F1-score.
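The refinement loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the one-class classifier here is a toy distance-to-centroid scorer, the per-model threshold (`keep_quantile`) and the rule "drop a sample only when every ensemble member flags it" are assumptions, and `learn_representation` is a hypothetical hook standing in for self-supervised learning (it defaults to the identity). All function names are illustrative.

```python
import math
import random


def fit_centroid(points):
    """Toy one-class classifier: the 'model' is the mean of its training subset."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def score(centroid, point):
    """Anomaly score: Euclidean distance to the centroid (larger = more anomalous)."""
    return math.dist(centroid, point)


def refine(features, n_models=3, keep_quantile=0.8, seed=0):
    """Return the indices of samples kept after one refinement pass.

    Assumed rule: a sample is dropped only if every ensemble member
    scores it above that member's own threshold.
    """
    rng = random.Random(seed)
    indices = list(range(len(features)))
    rng.shuffle(indices)
    # Disjoint subsets: every index lands in exactly one fold.
    folds = [indices[k::n_models] for k in range(n_models)]

    models = []
    for fold in folds:
        subset = [features[i] for i in fold]
        centroid = fit_centroid(subset)
        train_scores = sorted(score(centroid, p) for p in subset)
        # Threshold at the chosen quantile of this member's own training scores.
        cut_index = min(len(train_scores) - 1, int(keep_quantile * len(train_scores)))
        models.append((centroid, train_scores[cut_index]))

    kept = []
    for i, point in enumerate(features):
        flagged_by_all = all(score(c, point) > cut for c, cut in models)
        if not flagged_by_all:
            kept.append(i)
    return kept


def refine_and_repeat(data, n_rounds=3, learn_representation=None, **kwargs):
    """Alternate representation learning and data refinement for n_rounds.

    learn_representation is a hypothetical hook: given the refined samples,
    it returns a function mapping a raw sample to a feature vector.
    """
    kept = list(range(len(data)))
    for round_index in range(n_rounds):
        refined = [data[i] for i in kept]
        embed = learn_representation(refined) if learn_representation else (lambda p: p)
        features = [embed(p) for p in data]
        # Re-score the full dataset with an ensemble built on the new features.
        kept = refine(features, seed=round_index, **kwargs)
    return kept
```

Calling `refine_and_repeat(data)` on a list of equal-length numeric tuples returns the indices of the samples that survive refinement; a final one-class classifier would then be trained on those samples only. With the identity hook the rounds differ only in how the data is split, so the benefit of repetition appears only once a real representation learner is plugged in.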
