LG AI | Jul 8, 2024

Balanced Edge Pruning for Graph Anomaly Detection with Noisy Labels

Zhu Wang, Junnan Dong, Shuang Zhou, Chang Yang, Shengjie Zhao, Xiao Huang

arXiv:2407.05934v2 | 4.62 citations | h-index: 12

Originality: Incremental advance

AI Analysis

This work addresses the challenge of noisy labels in graph anomaly detection, which is critical for applications like fraud detection, but it is incremental as it builds on existing edge-pruning methods with reinforcement learning enhancements.

The paper tackles the problem of graph anomaly detection with noisy labels by proposing a reinforced graph anomaly detector (REGAD) that prunes edges to mitigate noise propagation, achieving superior performance over baselines across three real-world datasets under varying noise ratios.

Graph anomaly detection (GAD) is widely applied in areas such as financial fraud detection and social spammer detection. Anomalous nodes not only affect their own communities but also create a ripple effect on neighbors throughout the graph structure, and detecting them in complex graphs remains a challenging task. Existing GAD methods assume all labels are correct, yet real-world scenarios often involve inaccurate annotations. These noisy labels can severely degrade GAD performance: because anomalies are a minority class, even a small number of mislabeled instances can disproportionately interfere with detection models. Cutting edges is one way to mitigate the negative effects of noisy labels; however, edge removal has both positive and negative influences, and it also raises a weak-supervision problem. To perform effective GAD with noisy labels, we propose the REinforced Graph Anomaly Detector (REGAD), which prunes the edges of candidate nodes that potentially carry mistaken labels. Moreover, we design performance feedback based on strategically crafted confident labels to guide the cutting process, ensuring optimal results. Specifically, REGAD contains two novel components. (i) A tailored policy network that uses two-step actions to remove negative effect propagation step by step. (ii) A policy-in-the-loop mechanism that identifies suitable edge removal strategies to control the propagation of noise on the graph and estimates the updated structure to obtain reliable pseudo labels iteratively. Experiments on three real-world datasets demonstrate that REGAD outperforms all baselines under different noise ratios.
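
To make the edge-pruning idea concrete, here is a minimal sketch in plain Python. It is not REGAD: the abstract describes a learned policy network trained with reinforcement learning, whereas this sketch swaps in a simple neighbor-disagreement heuristic to pick candidate nodes. The function names (`build_adj`, `find_candidates`, `prune_edges`, `neighbor_vote`), the toy graph, and the labels are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def build_adj(edges):
    """Build an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def find_candidates(adj, labels):
    """Flag labeled nodes whose label disagrees with most labeled neighbors.

    Assumption: this heuristic stands in for the paper's learned policy.
    """
    candidates = set()
    for node, label in labels.items():
        votes = Counter(labels[n] for n in adj[node] if n in labels)
        agree = votes[label]
        disagree = sum(votes.values()) - agree
        if disagree > agree:
            candidates.add(node)
    return candidates


def prune_edges(adj, candidates):
    """Return a copy of the graph with every edge touching a candidate removed."""
    return {
        node: set() if node in candidates
        else {n for n in nbrs if n not in candidates}
        for node, nbrs in adj.items()
    }


def neighbor_vote(adj, labels, node):
    """Majority label among a node's labeled neighbors, or None if it has none."""
    votes = Counter(labels[n] for n in adj[node] if n in labels)
    return votes.most_common(1)[0][0] if votes else None


# Toy graph: nodes 0-3 form a normal clique; nodes 4 and 5 are normal nodes
# mislabeled as anomalies (label 1); node 6 is unlabeled.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
         (4, 0), (4, 1), (4, 2), (5, 0), (5, 1), (5, 2),
         (6, 4), (6, 5), (6, 0)]
labels = {0: 0, 1: 0, 2: 0, 3: 0, 4: 1, 5: 1}

adj = build_adj(edges)
candidates = find_candidates(adj, labels)
pruned = prune_edges(adj, candidates)

vote_before = neighbor_vote(adj, labels, 6)     # noisy labels dominate node 6's vote
vote_after = neighbor_vote(pruned, labels, 6)   # only the clean neighbor remains
print(candidates, vote_before, vote_after)
```

In this toy setup the two mislabeled nodes outvote the single clean neighbor of node 6 before pruning, and cutting their edges stops that noise from propagating. The sketch also shows the trade-off the abstract mentions: pruning isolates nodes 4 and 5 entirely, so any useful structure they carried is lost as well.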
