LG AI Sep 21, 2022

Improving Generalizability of Graph Anomaly Detection Models via Data Augmentation

Shuang Zhou, Xiao Huang, Ninghao Liu, Fu-Lai Chung, Long-Kai Huang

arXiv:2209.10168v24.638 citations h-index: 33

Originality: Incremental advance

AI Analysis

The paper addresses a critical issue for securing business applications where labels are scarce, though the advance appears incremental: it builds on existing methods and contributes a new augmentation approach.

The paper tackles the poor generalization of semi-supervised graph anomaly detection models to unseen graph areas, proposing a data augmentation method called AugAN that enriches training data to boost generalizability, with experiments verifying its effectiveness.

Graph anomaly detection (GAD) is a vital task, since even a few anomalies can pose serious threats to benign users. Recent semi-supervised GAD methods, which effectively leverage the available labels as prior knowledge, have achieved better performance than unsupervised methods. In practice, people often need to identify anomalies on new (sub)graphs to secure their business, but they may lack the labels needed to train an effective detection model. One natural idea is to apply a trained GAD model directly to the new (sub)graph for testing. However, we find that existing semi-supervised GAD methods generalize poorly: well-trained models do not perform well on an unseen area of the same graph, i.e., an area that was not accessible during training. Anomalies in such areas may therefore go undetected. Based on this phenomenon, we propose a general and novel research problem, generalized graph anomaly detection, which aims to effectively identify anomalies on both the training-domain graph and the unseen testing graph in order to eliminate potential dangers. This is a challenging task because only limited labels are available and the normal background may differ between training and testing data. Accordingly, we propose a data augmentation method named AugAN (Augmentation for Anomaly and Normal distributions) to enrich the training data and boost the generalizability of GAD models. Experiments verify the effectiveness of our method in improving model generalizability.
