LGAug 1, 2025

Text-Attributed Graph Anomaly Detection via Multi-Scale Cross- and Uni-Modal Contrastive Learning

Yiming Xu, Xu Hua, Zhen Peng, Bin Shi, Jiarun Chen, Xingbo Fu, Song Wang, Bo Dong

arXiv:2508.00513v19.43 citationsh-index: 7ECAI

Originality Highly original

AI Analysis

This addresses anomaly detection in high-risk scenarios using text-attributed graphs, offering a novel integration approach that is not incremental but introduces a new paradigm.

The paper tackles the problem of detecting anomalies in text-attributed graphs by proposing CMUCL, an end-to-end method that integrates raw text and graph topology through multi-scale contrastive learning, resulting in an 11.13% average accuracy improvement over the suboptimal method.

The widespread application of graph data in various high-risk scenarios has increased attention to graph anomaly detection (GAD). Faced with real-world graphs that often carry node descriptions in the form of raw text sequences, termed text-attributed graphs (TAGs), existing graph anomaly detection pipelines typically involve shallow embedding techniques to encode such textual information into features, and then rely on complex self-supervised tasks within the graph domain to detect anomalies. However, this text encoding process is separated from the anomaly detection training objective in the graph domain, making it difficult to ensure that the extracted textual features focus on GAD-relevant information, seriously constraining the detection capability. How to seamlessly integrate raw text and graph topology to unleash the vast potential of cross-modal data in TAGs for anomaly detection poses a challenging issue. This paper presents a novel end-to-end paradigm for text-attributed graph anomaly detection, named CMUCL. We simultaneously model data from both text and graph structures, and jointly train text and graph encoders by leveraging cross-modal and uni-modal multi-scale consistency to uncover potential anomaly-related information. Accordingly, we design an anomaly score estimator based on inconsistency mining to derive node-specific anomaly scores. Considering the lack of benchmark datasets tailored for anomaly detection on TAGs, we release 8 datasets to facilitate future research. Extensive evaluations show that CMUCL significantly advances in text-attributed graph anomaly detection, delivering an 11.13% increase in average accuracy (AP) over the suboptimal.

View on arXiv PDF

Similar