LG · AI · Jan 25, 2024

Alleviating Structural Distribution Shift in Graph Anomaly Detection

arXiv:2401.14155v1 · 97 citations · Has Code · WSDM
Originality: Incremental advance
AI Analysis

This work addresses generalization issues in graph anomaly detection for applications like fraud detection, but it is incremental as it builds on existing GNN methods with a novel feature-based approach.

The paper tackles structural distribution shift (SDS) in graph anomaly detection, where the heterophily and homophily of anomalies differ between training and testing data. It proposes a Graph Decomposition Network (GDN) that constrains anomaly features to mitigate the effect of heterophilous neighbors, and it reports a performance boost that is largest in SDS environments.

Graph anomaly detection (GAD) is a challenging binary classification problem because anomalies and normal nodes follow different structural distributions: abnormal nodes are a minority and therefore exhibit high heterophily and low homophily compared with normal nodes. Furthermore, due to various time factors and the annotation preferences of human experts, heterophily and homophily can change across training and testing data, which this paper calls structural distribution shift (SDS). Mainstream methods are built on graph neural networks (GNNs), which benefit the classification of normal nodes by aggregating homophilous neighbors, yet they ignore the SDS issue for anomalies and suffer from poor generalization. This work addresses the problem from a feature view. We observe that the degree of SDS varies between anomalies and normal nodes. Hence, the key to addressing the issue lies in resisting high heterophily for anomalies while letting the learning of normal nodes benefit from homophily. We tease out the anomaly features and constrain them to mitigate the effect of heterophilous neighbors and make them invariant. We term the proposed framework Graph Decomposition Network (GDN). Extensive experiments are conducted on two benchmark datasets, and the proposed framework achieves a remarkable performance boost in GAD, especially in an SDS environment where anomalies have largely different structural distributions across training and testing environments. Code is open-sourced at https://github.com/blacksingular/wsdm_GDN.
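To make the notions of heterophily and structural distribution shift concrete, the following is a minimal sketch on a hand-made toy graph. It is not the paper's GDN method or its exact metric. It assumes heterophily is measured per node as the fraction of neighbors with a different label, and that SDS can be summarized as the gap in mean heterophily between the test and training splits. The graph, the split, and the helper names `node_heterophily` and `mean_heterophily` are all hypothetical.

```python
from collections import defaultdict


def node_heterophily(edges, labels):
    """Per-node fraction of neighbors whose label differs from the node's own.

    Assumption: this simple ratio stands in for the heterophily notion
    described in the abstract; the paper may define it differently.
    """
    neighbors = defaultdict(set)
    for u, v in edges:  # treat the graph as undirected
        neighbors[u].add(v)
        neighbors[v].add(u)
    het = {}
    for node, nbrs in neighbors.items():
        differing = sum(1 for n in nbrs if labels[n] != labels[node])
        het[node] = differing / len(nbrs)
    return het


def mean_heterophily(het, nodes):
    """Average heterophily over the given nodes (NaN if none have neighbors)."""
    vals = [het[n] for n in nodes if n in het]
    return sum(vals) / len(vals) if vals else float("nan")


# Hypothetical toy data: label 1 = anomaly, label 0 = normal.
labels = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 0, 6: 0, 7: 1}
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (3, 0),
         (5, 6), (7, 5), (7, 6), (2, 5)]
train_nodes = {0, 1, 2, 3, 4}
test_nodes = {5, 6, 7}

het = node_heterophily(edges, labels)


def split_mean(split, label):
    """Mean heterophily of the nodes in `split` that carry `label`."""
    return mean_heterophily(het, [n for n in split if labels[n] == label])


# Shift = test-split mean minus train-split mean, computed per class.
anomaly_shift = split_mean(test_nodes, 1) - split_mean(train_nodes, 1)
normal_shift = split_mean(test_nodes, 0) - split_mean(train_nodes, 0)

print(f"anomaly heterophily shift: {anomaly_shift:.3f}")
print(f"normal heterophily shift:  {normal_shift:.3f}")
```

In this toy split the anomalies' heterophily changes more between training and testing than that of the normal nodes, which mirrors the abstract's observation that the degree of SDS differs between the two classes. Real benchmarks would need the labels and splits shipped with the datasets.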

Code Implementations: 1 repo