LG AI · Jun 18, 2023

Improving Generalizability of Graph Anomaly Detection Models via Data Augmentation

Shuang Zhou, Xiao Huang, Ninghao Liu, Huachi Zhou, Fu-Lai Chung, Long-Kai Huang

arXiv:2306.10534v1 · 14.940 citations · h-index: 33

Originality: Incremental advance

AI Analysis

The paper addresses a critical issue for practitioners who need to secure business operations by detecting anomalies on new graphs without labels, though the advance is incremental because it builds on existing methods.

The paper tackles the poor generalization of semi-supervised graph anomaly detection models to unseen graph areas by proposing a data augmentation method called AugAN, which enriches training data and improves model generalizability as verified in experiments.

Graph anomaly detection (GAD) is a vital task since even a few anomalies can pose huge threats to benign users. Recent semi-supervised GAD methods, which can effectively leverage the available labels as prior knowledge, have achieved better performance than unsupervised methods. In practice, people usually need to identify anomalies on new (sub)graphs to secure their business, but they may lack labels to train an effective detection model. One natural idea is to directly apply a trained GAD model to the new (sub)graph for testing. However, we find that existing semi-supervised GAD methods suffer from poor generalization, i.e., well-trained models do not perform well on an unseen area (one not accessible during training) of the same graph. This can cause serious problems. Based on this phenomenon, we propose a general and novel research problem, generalized graph anomaly detection, which aims to effectively identify anomalies on both the training-domain graph and the unseen testing graph to eliminate potential dangers. Nevertheless, it is a challenging task since only limited labels are available, and the normal background may differ between training and testing data. Accordingly, we propose a data augmentation method named AugAN (Augmentation for Anomaly and Normal distributions) to enrich training data and boost the generalizability of GAD models. Experiments verify the effectiveness of our method in improving model generalizability.
