Chuan Zhou

LG
h-index13
40papers
3,648citations
Novelty44%
AI Score58

40 Papers

LGMay 31, 2022Code
Graph-level Neural Networks: Current Progress and Future Directions

Ge Zhang, Jia Wu, Jian Yang et al. · allen-ai

Graph-structured data consisting of objects (i.e., nodes) and relationships among objects (i.e., edges) are ubiquitous. Graph-level learning is a matter of studying a collection of graphs instead of a single graph. Traditional graph-level learning methods used to be the mainstream. However, with the increasing scale and complexity of graphs, Graph-level Neural Networks (GLNNs, deep learning-based graph-level learning methods) have been attractive due to their superiority in modeling high-dimensional data. Thus, a survey on GLNNs is necessary. To frame this survey, we propose a systematic taxonomy covering GLNNs upon deep neural networks, graph neural networks, and graph pooling. The representative and state-of-the-art models in each category are focused on this survey. We also investigate the reproducibility, benchmarks, and new graph datasets of GLNNs. Finally, we conclude future directions to further push forward GLNNs. The repository of this survey is available at https://github.com/GeZhangMQ/Awesome-Graph-level-Neural-Networks.

LGMay 29
AbstainGNN: Teaching Graph Neural Networks to Abstain for Graph Classification

Xixun Lin, Zhiheng Zhou, Zhengyin Zhang et al.

Graph classification is a core task in graph data mining with widespread real-world applications. Recent advances in graph neural networks (GNNs) have led to substantial performance improvements for graph classification. However, existing GNNs are typically forced to make predictions even under high uncertainty or unknown conditions, resulting in unreliable decisions that can severely impact downstream tasks, particularly in safety-critical scenarios. To address this critical limitation, we propose AbstainGNN, a novel and theory-driven framework for graph classification with abstention, which enables GNNs to reject uncertain predictions instead of producing incorrect decisions. Specifically, AbstainGNN explicitly models both the predictive function and the abstention function, allowing for effective utilization of graph structural information. Moreover, unlike existing heuristic abstention methods, we theoretically characterize the trade-off between classification errors and rejection costs from a PAC-Bayesian generalization perspective, and derive a unified learning objective for model optimization. Guided by this theoretical insight, we further develop an efficient two-stage training strategy consisting of predictive function warm-start and abstention function calibration. Extensive experiments on five benchmark datasets show that AbstainGNN outperforms existing abstention methods, achieving superior classification performance under the same rejection rates.

LGOct 18, 2022
DAGAD: Data Augmentation for Graph Anomaly Detection

Fanzhen Liu, Xiaoxiao Ma, Jia Wu et al.

Graph anomaly detection in this paper aims to distinguish abnormal nodes that behave differently from the benign ones accounting for the majority of graph-structured instances. Receiving increasing attention from both academia and industry, yet existing research on this task still suffers from two critical issues when learning informative anomalous behavior from graph data. For one thing, anomalies are usually hard to capture because of their subtle abnormal behavior and the shortage of background knowledge about them, which causes severe anomalous sample scarcity. Meanwhile, the overwhelming majority of objects in real-world graphs are normal, bringing the class imbalance problem as well. To bridge the gaps, this paper devises a novel Data Augmentation-based Graph Anomaly Detection (DAGAD) framework for attributed graphs, equipped with three specially designed modules: 1) an information fusion module employing graph neural network encoders to learn representations, 2) a graph data augmentation module that fertilizes the training set with generated samples, and 3) an imbalance-tailored learning module to discriminate the distributions of the minority (anomalous) and majority (normal) classes. A series of experiments on three datasets prove that DAGAD outperforms ten state-of-the-art baseline detectors concerning various mostly-used metrics, together with an extensive ablation study validating the strength of our proposed modules.

LGJun 1, 2022
Ultrahyperbolic Knowledge Graph Embeddings

Bo Xiong, Shichao Zhu, Mojtaba Nayyeri et al.

Recent knowledge graph (KG) embeddings have been advanced by hyperbolic geometry due to its superior capability for representing hierarchies. The topological structures of real-world KGs, however, are rather heterogeneous, i.e., a KG is composed of multiple distinct hierarchies and non-hierarchical graph structures. Therefore, a homogeneous (either Euclidean or hyperbolic) geometry is not sufficient for fairly representing such heterogeneous structures. To capture the topological heterogeneity of KGs, we present an ultrahyperbolic KG embedding (UltraE) in an ultrahyperbolic (or pseudo-Riemannian) manifold that seamlessly interleaves hyperbolic and spherical manifolds. In particular, we model each relation as a pseudo-orthogonal transformation that preserves the pseudo-Riemannian bilinear form. The pseudo-orthogonal transformation is decomposed into various operators (i.e., circular rotations, reflections and hyperbolic rotations), allowing for simultaneously modeling heterogeneous structures as well as complex relational patterns. Experimental results on three standard KGs show that UltraE outperforms previous Euclidean- and hyperbolic-based approaches.

LGJan 14, 2023
State of the Art and Potentialities of Graph-level Learning

Zhenyu Yang, Ge Zhang, Jia Wu et al.

Graphs have a superior ability to represent relational data, like chemical compounds, proteins, and social networks. Hence, graph-level learning, which takes a set of graphs as input, has been applied to many tasks including comparison, regression, classification, and more. Traditional approaches to learning a set of graphs heavily rely on hand-crafted features, such as substructures. But while these methods benefit from good interpretability, they often suffer from computational bottlenecks as they cannot skirt the graph isomorphism problem. Conversely, deep learning has helped graph-level learning adapt to the growing scale of graphs by extracting features automatically and encoding graphs into low-dimensional representations. As a result, these deep graph learning methods have been responsible for many successes. Yet, there is no comprehensive survey that reviews graph-level learning starting with traditional learning and moving through to the deep learning approaches. This article fills this gap and frames the representative algorithms into a systematic taxonomy covering traditional learning, graph-level deep neural networks, graph-level graph neural networks, and graph pooling. To ensure a thoroughly comprehensive survey, the evolutions, interactions, and communications between methods from four different branches of development are also examined. This is followed by a brief review of the benchmark data sets, evaluation metrics, and common downstream applications. The survey concludes with a broad overview of 12 current and future directions in this booming field.

LGJun 2
Message Tuning Outshines Graph Prompt Tuning: A Prismatic Space Perspective

Yancheng Chen, Dun Ma, Shuai Zhang et al.

Graph Foundation Models (GFMs), built upon the Pre-training and Adaptation paradigm, have emerged as a research hotspot in graph learning. For GNN-based GFMs, graph prompt tuning has become the prevailing adaptation method for downstream tasks. Although recent methods explain why graph prompt tuning works, how to rigorously measure its adaptation capacity remains an open problem. Addressing this problem is critical for understanding the capability limits of graph prompt tuning and for developing more powerful adaptation methods. In this paper, we propose Prismatic Space Theory (PS-Theory), a novel mathematical framework to quantify the capacity of adaptation methods, while focusing on establishing the upper bound for the adaptation capacity of graph prompt tuning. Building upon the proposed PS-Theory, we further introduce Message Tuning for GFMs (MTG), a lightweight approach that injects a small set of learnable message prototypes into each layer of the GNN backbone to adaptively guide message fusion without updating pre-trained weights. Through our PS-Theory, we prove that the adaptation capacity of MTG can exceed the theoretical upper bound of graph prompt tuning. Extensive experiments demonstrate that MTG consistently outperforms graph prompt baselines across diverse benchmark datasets, providing strong empirical support for our theoretical findings.

LGFeb 23, 2023
Auto-HeG: Automated Graph Neural Network on Heterophilic Graphs

Xin Zheng, Miao Zhang, Chunyang Chen et al.

Graph neural architecture search (NAS) has gained popularity in automatically designing powerful graph neural networks (GNNs) with relieving human efforts. However, existing graph NAS methods mainly work under the homophily assumption and overlook another important graph property, i.e., heterophily, which exists widely in various real-world applications. To date, automated heterophilic graph learning with NAS is still a research blank to be filled in. Due to the complexity and variety of heterophilic graphs, the critical challenge of heterophilic graph NAS mainly lies in developing the heterophily-specific search space and strategy. Therefore, in this paper, we propose a novel automated graph neural network on heterophilic graphs, namely Auto-HeG, to automatically build heterophilic GNN models with expressive learning abilities. Specifically, Auto-HeG incorporates heterophily into all stages of automatic heterophilic graph learning, including search space design, supernet training, and architecture selection. Through the diverse message-passing scheme with joint micro-level and macro-level designs, we first build a comprehensive heterophilic GNN search space, enabling Auto-HeG to integrate complex and various heterophily of graphs. With a progressive supernet training strategy, we dynamically shrink the initial search space according to layer-wise variation of heterophily, resulting in a compact and efficient supernet. Taking a heterophily-aware distance criterion as the guidance, we conduct heterophilic architecture selection in the leave-one-out pattern, so that specialized and expressive heterophilic GNN architectures can be derived. Extensive experiments illustrate the superiority of Auto-HeG in developing excellent heterophilic GNNs to human-designed models and graph NAS models.

CEJun 1
Beyond Pairwise Interactions: Equivariant Hypergraph Diffusion for Crystal Structure Prediction

Yang Liu, Chuan Zhou, Shuai Zhang et al.

Crystal Structure Prediction (CSP) remains a fundamental challenge with significant implications for materials discovery and the advancement of various scientific disciplines. Recent advances have demonstrated that generative models, particularly diffusion models, are especially promising for CSP. However, traditional graph-based representations, where atomic bonds are modeled as pairwise graph edges, fail to capture the intricate high-order interactions essential for accurately describing crystal structures. To address this limitation, we propose leveraging hypergraphs to represent crystal structures, enabling more expressive modeling of multi-way atomic interactions. Hypergraphs naturally encode complex high-order relationships and respect key symmetries -- such as permutation and periodic translation invariance -- that are crucial for characterizing crystalline materials. Building on this representation, we propose the \textbf{E}quivariant \textbf{H}ypergraph \textbf{Diff}usion Model (\textbf{EH-Diff}), a generative framework designed to exploit the symmetry-preserving properties of hypergraphs. EH-Diff provides an efficient and accurate method for predicting crystal structures, with rigorous theoretical guarantees on invariance preservation. Empirically, we conduct extensive experiments on four benchmark datasets, and the results demonstrate that EH-Diff outperforms state-of-the-art CSP methods even with a single diffusion sample.

LGJun 25, 2022
Geometry Contrastive Learning on Heterogeneous Graphs

Shichao Zhu, Chuan Zhou, Anfeng Cheng et al.

Self-supervised learning (especially contrastive learning) methods on heterogeneous graphs can effectively get rid of the dependence on supervisory data. Meanwhile, most existing representation learning methods embed the heterogeneous graphs into a single geometric space, either Euclidean or hyperbolic. This kind of single geometric view is usually not enough to observe the complete picture of heterogeneous graphs due to their rich semantics and complex structures. Under these observations, this paper proposes a novel self-supervised learning method, termed as Geometry Contrastive Learning (GCL), to better represent the heterogeneous graphs when supervisory data is unavailable. GCL views a heterogeneous graph from Euclidean and hyperbolic perspective simultaneously, aiming to make a strong merger of the ability of modeling rich semantics and complex structures, which is expected to bring in more benefits for downstream tasks. GCL maximizes the mutual information between two geometric views by contrasting representations at both local-local and local-global semantic levels. Extensive experiments on four benchmarks data sets show that the proposed approach outperforms the strong baselines, including both unsupervised methods and supervised methods, on three tasks, including node classification, node clustering and similarity search.

LGOct 23, 2023
GNNEvaluator: Evaluating GNN Performance On Unseen Graphs Without Labels

Xin Zheng, Miao Zhang, Chunyang Chen et al.

Evaluating the performance of graph neural networks (GNNs) is an essential task for practical GNN model deployment and serving, as deployed GNNs face significant performance uncertainty when inferring on unseen and unlabeled test graphs, due to mismatched training-test graph distributions. In this paper, we study a new problem, GNN model evaluation, that aims to assess the performance of a specific GNN model trained on labeled and observed graphs, by precisely estimating its performance (e.g., node classification accuracy) on unseen graphs without labels. Concretely, we propose a two-stage GNN model evaluation framework, including (1) DiscGraph set construction and (2) GNNEvaluator training and inference. The DiscGraph set captures wide-range and diverse graph data distribution discrepancies through a discrepancy measurement function, which exploits the outputs of GNNs related to latent node embeddings and node class predictions. Under the effective training supervision from the DiscGraph set, GNNEvaluator learns to precisely estimate node classification accuracy of the to-be-evaluated GNN model and makes an accurate inference for evaluating GNN model performance. Extensive experiments on real-world unseen and unlabeled test graphs demonstrate the effectiveness of our proposed method for GNN model evaluation.

LGApr 20, 2023
ID-MixGCL: Identity Mixup for Graph Contrastive Learning

Gehang Zhang, Bowen Yu, Jiangxia Cao et al.

Graph contrastive learning (GCL) has recently achieved substantial advancements. Existing GCL approaches compare two different ``views'' of the same graph in order to learn node/graph representations. The underlying assumption of these studies is that the graph augmentation strategy is capable of generating several different graph views such that the graph views are structurally different but semantically similar to the original graphs, and thus the ground-truth labels of the original and augmented graph/nodes can be regarded identical in contrastive learning. However, we observe that this assumption does not always hold. For instance, the deletion of a super-node within a social network can exert a substantial influence on the partitioning of communities for other nodes. Similarly, any perturbation to nodes or edges in a molecular graph will change the labels of the graph. Therefore, we believe that augmenting the graph, accompanied by an adaptation of the labels used for the contrastive loss, will facilitate the encoder to learn a better representation. Based on this idea, we propose ID-MixGCL, which allows the simultaneous interpolation of input nodes and corresponding identity labels to obtain soft-confidence samples, with a controllable degree of change, leading to the capture of fine-grained representations from self-supervised training on unlabeled graphs. Experimental results demonstrate that ID-MixGCL improves performance on graph classification and node classification tasks, as demonstrated by significant improvements on the Cora, IMDB-B, IMDB-M, and PROTEINS datasets compared to state-of-the-art techniques, by 3-29% absolute points.

LGJul 23, 2024
Transformer-based Graph Neural Networks for Battery Range Prediction in AIoT Battery-Swap Services

Zhao Li, Yang Liu, Chuan Zhou et al.

The concept of the sharing economy has gained broad recognition, and within this context, Sharing E-Bike Battery (SEB) have emerged as a focal point of societal interest. Despite the popularity, a notable discrepancy remains between user expectations regarding the remaining battery range of SEBs and the reality, leading to a pronounced inclination among users to find an available SEB during emergency situations. In response to this challenge, the integration of Artificial Intelligence of Things (AIoT) and battery-swap services has surfaced as a viable solution. In this paper, we propose a novel structural Transformer-based model, referred to as the SEB-Transformer, designed specifically for predicting the battery range of SEBs. The scenario is conceptualized as a dynamic heterogeneous graph that encapsulates the interactions between users and bicycles, providing a comprehensive framework for analysis. Furthermore, we incorporate the graph structure into the SEB-Transformer to facilitate the estimation of the remaining e-bike battery range, in conjunction with mean structural similarity, enhancing the prediction accuracy. By employing the predictions made by our model, we are able to dynamically adjust the optimal cycling routes for users in real-time, while also considering the strategic locations of charging stations, thereby optimizing the user experience. Empirically our results on real-world datasets demonstrate the superiority of our model against nine competitive baselines. These innovations, powered by AIoT, not only bridge the gap between user expectations and the physical limitations of battery range but also significantly improve the operational efficiency and sustainability of SEB services. Through these advancements, the shared electric bicycle ecosystem is evolving, making strides towards a more reliable, user-friendly, and sustainable mode of transportation.

AIApr 14
CIA: Inferring the Communication Topology from LLM-based Multi-Agent Systems

Yongxuan Wu, Xixun Lin, He Zhang et al.

LLM-based Multi-Agent Systems (MAS) have demonstrated remarkable capabilities in solving complex tasks. Central to MAS is the communication topology which governs how agents exchange information internally. Consequently, the security of communication topologies has attracted increasing attention. In this paper, we investigate a critical privacy risk: MAS communication topologies can be inferred under a restrictive black-box setting, exposing system vulnerabilities and posing significant intellectual property threats. To explore this risk, we propose Communication Inference Attack (CIA), a novel attack that constructs new adversarial queries to induce intermediate agents' reasoning outputs and models their semantic correlations through the proposed global bias disentanglement and LLM-guided weak supervision. Extensive experiments on MAS with optimized communication topologies demonstrate the effectiveness of CIA, achieving an average AUC of 0.87 and a peak AUC of up to 0.99, thereby revealing the substantial privacy risk in MAS.

LGNov 10, 2025
Correcting False Alarms from Unseen: Adapting Graph Anomaly Detectors at Test Time

Junjun Pan, Yixin Liu, Chuan Zhou et al.

Graph anomaly detection (GAD), which aims to detect outliers in graph-structured data, has received increasing research attention recently. However, existing GAD methods assume identical training and testing distributions, which is rarely valid in practice. In real-world scenarios, unseen but normal samples may emerge during deployment, leading to a normality shift that degrades the performance of GAD models trained on the original data. Through empirical analysis, we reveal that the degradation arises from (1) semantic confusion, where unseen normal samples are misinterpreted as anomalies due to their novel patterns, and (2) aggregation contamination, where the representations of seen normal nodes are distorted by unseen normals through message aggregation. While retraining or fine-tuning GAD models could be a potential solution to the above challenges, the high cost of model retraining and the difficulty of obtaining labeled data often render this approach impractical in real-world applications. To bridge the gap, we proposed a lightweight and plug-and-play Test-time adaptation framework for correcting Unseen Normal pattErns (TUNE) in GAD. To address semantic confusion, a graph aligner is employed to align the shifted data to the original one at the graph attribute level. Moreover, we utilize the minimization of representation-level shift as a supervision signal to train the aligner, which leverages the estimated aggregation contamination as a key indicator of normality shift. Extensive experiments on 10 real-world datasets demonstrate that TUNE significantly enhances the generalizability of pre-trained GAD models to both synthetic and real unseen normal patterns.

CLJan 28
MuVaC: AVariational Causal Framework for Multimodal Sarcasm Understanding in Dialogues

Diandian Guo, Fangfang Yuan, Cong Cao et al.

The prevalence of sarcasm in multimodal dialogues on the social platforms presents a crucial yet challenging task for understanding the true intent behind online content. Comprehensive sarcasm analysis requires two key aspects: Multimodal Sarcasm Detection (MSD) and Multimodal Sarcasm Explanation (MuSE). Intuitively, the act of detection is the result of the reasoning process that explains the sarcasm. Current research predominantly focuses on addressing either MSD or MuSE as a single task. Even though some recent work has attempted to integrate these tasks, their inherent causal dependency is often overlooked. To bridge this gap, we propose MuVaC, a variational causal inference framework that mimics human cognitive mechanisms for understanding sarcasm, enabling robust multimodal feature learning to jointly optimize MSD and MuSE. Specifically, we first model MSD and MuSE from the perspective of structural causal models, establishing variational causal pathways to define the objectives for joint optimization. Next, we design an alignment-then-fusion approach to integrate multimodal features, providing robust fusion representations for sarcasm detection and explanation generation. Finally, we enhance the reasoning trustworthiness by ensuring consistency between detection results and explanations. Experimental results demonstrate the superiority of MuVaC in public datasets, offering a new perspective for understanding multimodal sarcasm.

CRApr 15
SafeHarness: Lifecycle-Integrated Security Architecture for LLM-based Agent Deployment

Xixun Lin, Yang Liu, Yancheng Chen et al.

The performance of large language model (LLM) agents depends critically on the execution harness, the system layer that orchestrates tool use, context management, and state persistence. Yet this same architectural centrality makes the harness a high-value attack surface: a single compromise at the harness level can cascade through the entire execution pipeline. We observe that existing security approaches suffer from structural mismatch, leaving them blind to harness-internal state and unable to coordinate across the different phases of agent operation. In this paper, we introduce \safeharness{}, a security architecture in which four proposed defense layers are woven directly into the agent lifecycle to address above significant limitations: adversarial context filtering at input processing, tiered causal verification at decision making, privilege-separated tool control at action execution, and safe rollback with adaptive degradation at state update. The proposed cross-layer mechanisms tie these layers together, escalating verification rigor, triggering rollbacks, and tightening tool privileges whenever sustained anomalies are detected. We evaluate \safeharness{} on benchmark datasets across diverse harness configurations, comparing against four security baselines under five attack scenarios spanning six threat categories. Compared to the unprotected baseline, \safeharness{} achieves an average reduction of approximately 38\% in UBR and 42\% in ASR, substantially lowering both the unsafe behavior rate and the attack success rate while preserving core task utility.

AIAug 27, 2024
CL4KGE: A Curriculum Learning Method for Knowledge Graph Embedding

Yang Liu, Chuan Zhou, Peng Zhang et al.

Knowledge graph embedding (KGE) constitutes a foundational task, directed towards learning representations for entities and relations within knowledge graphs (KGs), with the objective of crafting representations comprehensive enough to approximate the logical and symbolic interconnections among entities. In this paper, we define a metric Z-counts to measure the difficulty of training each triple ($<$head entity, relation, tail entity$>$) in KGs with theoretical analysis. Based on this metric, we propose \textbf{CL4KGE}, an efficient \textbf{C}urriculum \textbf{L}earning based training strategy for \textbf{KGE}. This method includes a difficulty measurer and a training scheduler that aids in the training of KGE models. Our approach possesses the flexibility to act as a plugin within a wide range of KGE models, with the added advantage of adaptability to the majority of KGs in existence. The proposed method has been evaluated on popular KGE models, and the results demonstrate that it enhances the state-of-the-art methods. The use of Z-counts as a metric has enabled the identification of challenging triples in KGs, which helps in devising effective training strategies.

DSMay 1
Sparse Neighborhood Graph-Based Approximate Nearest Neighbor Search Revisited: Theoretical Analysis and Optimization

Xinran Ma, Zhaoqi Zhou, Chuan Zhou et al.

Graph-based approaches to approximate nearest neighbor search (ANNS) enable fast, high-recall retrieval on billion-scale vector datasets. Among them, the Sparse Neighborhood Graph (SNG) is widely used due to its strong search performance. However, the lack of theoretical understanding of SNG leads to expensive tuning of the truncation parameter that controls graph sparsification. In this work, we present OPT-SNG, a principled framework for analyzing and optimizing SNG construction. We introduce a martingale-based model of the pruning process that characterizes the stochastic evolution of candidate sets during graph construction. Using this framework, we prove that SNG has a maximum out-degree of \(O(n^{2/3+ε})\), where \(ε>0\) is an arbitrarily small constant, and an expected search path length of \(O(\log n)\). Building on these insights, we derive a closed-form rule for selecting the optimal truncation parameter \(R\), thereby eliminating the need for costly parameter sweeping. Extensive experiments on real-world datasets demonstrate that OPT-SNG achieves an average \(5.9\times\) speedup in index construction time, with peak improvements reaching \(15.4\times\), while consistently maintaining or improving search performance.

CLJun 1, 2024Code
Phased Instruction Fine-Tuning for Large Language Models

Wei Pang, Chuan Zhou, Xiao-Hua Zhou et al.

Instruction Fine-Tuning enhances pre-trained language models from basic next-word prediction to complex instruction-following. However, existing One-off Instruction Fine-Tuning (One-off IFT) method, applied on a diverse instruction, may not effectively boost models' adherence to instructions due to the simultaneous handling of varying instruction complexities. To improve this, Phased Instruction Fine-Tuning (Phased IFT) is proposed, based on the idea that learning to follow instructions is a gradual process. It assesses instruction difficulty using GPT-4, divides the instruction data into subsets of increasing difficulty, and uptrains the model sequentially on these subsets. Experiments with Llama-2 7B/13B/70B, Llama3 8/70B and Mistral-7B models using Alpaca data show that Phased IFT significantly outperforms One-off IFT, supporting the progressive alignment hypothesis and providing a simple and efficient way to enhance large language models. Codes and datasets from our experiments are freely available at https://github.com/xubuvd/PhasedSFT.

LGJun 14, 2021Code
A Comprehensive Survey on Graph Anomaly Detection with Deep Learning

Xiaoxiao Ma, Jia Wu, Shan Xue et al.

Anomalies represent rare observations (e.g., data records or events) that deviate significantly from others. Over several decades, research on anomaly mining has received increasing interests due to the implications of these occurrences in a wide range of disciplines. Anomaly detection, which aims to identify rare observations, is among the most vital tasks in the world, and has shown its power in preventing detrimental events, such as financial fraud, network intrusion, and social spam. The detection task is typically solved by identifying outlying data points in the feature space and inherently overlooks the relational information in real-world data. Graphs have been prevalently used to represent the structural information, which raises the graph anomaly detection problem - identifying anomalous graph objects (i.e., nodes, edges and sub-graphs) in a single graph, or anomalous graphs in a database/set of graphs. However, conventional anomaly detection techniques cannot tackle this problem well because of the complexity of graph data. For the advent of deep learning, graph anomaly detection with deep learning has received a growing attention recently. In this survey, we aim to provide a systematic and comprehensive review of the contemporary deep learning techniques for graph anomaly detection. We compile open-sourced implementations, public datasets, and commonly-used evaluation metrics to provide affluent resources for future studies. More importantly, we highlight twelve extensive future research directions according to our survey results covering unsolved and emerging research problems and real-world applications. With this survey, our goal is to create a "one-stop-shop" that provides a unified understanding of the problem categories and existing approaches, publicly available hands-on resources, and high-impact open challenges for graph anomaly detection using deep learning.

SIMay 26, 2021Code
A Comprehensive Survey on Community Detection with Deep Learning

Xing Su, Shan Xue, Fanzhen Liu et al.

A community reveals the features and connections of its members that are different from those in other communities in a network. Detecting communities is of great significance in network analysis. Despite the classical spectral clustering and statistical inference methods, we notice a significant development of deep learning techniques for community detection in recent years with their advantages in handling high dimensional network data. Hence, a comprehensive overview of community detection's latest progress through deep learning is timely to academics and practitioners. This survey devises and proposes a new taxonomy covering different state-of-the-art methods, including deep learning-based models upon deep neural networks, deep nonnegative matrix factorization and deep sparse filtering. The main category, i.e., deep neural networks, is further divided into convolutional networks, graph attention networks, generative adversarial networks and autoencoders. The survey also summarizes the popular benchmark data sets, evaluation metrics, and open-source implementations to address experimentation settings. We then discuss the practical applications of community detection in various domains and point to implementation scenarios. Finally, we outline future directions by suggesting challenging topics in this fast-growing deep learning field.

LGMay 12, 2021Code
Multi-Scale Contrastive Siamese Networks for Self-Supervised Graph Representation Learning

Ming Jin, Yizhen Zheng, Yuan-Fang Li et al.

Graph representation learning plays a vital role in processing graph-structured data. However, prior arts on graph representation learning heavily rely on labeling information. To overcome this problem, inspired by the recent success of graph contrastive learning and Siamese networks in visual representation learning, we propose a novel self-supervised approach in this paper to learn node representations by enhancing Siamese self-distillation with multi-scale contrastive learning. Specifically, we first generate two augmented views from the input graph based on local and global perspectives. Then, we employ two objectives called cross-view and cross-network contrastiveness to maximize the agreement between node representations across different views and networks. To demonstrate the effectiveness of our approach, we perform empirical experiments on five real-world datasets. Our method not only achieves new state-of-the-art results but also surpasses some semi-supervised counterparts by large margins. Code is made available at https://github.com/GRAND-Lab/MERIT

LGFeb 27, 2021Code
Graph Self-Supervised Learning: A Survey

Yixin Liu, Ming Jin, Shirui Pan et al.

Deep learning on graphs has attracted significant interests recently. However, most of the works have focused on (semi-) supervised learning, resulting in shortcomings including heavy label reliance, poor generalization, and weak robustness. To address these issues, self-supervised learning (SSL), which extracts informative knowledge through well-designed pretext tasks without relying on manual labels, has become a promising and trending learning paradigm for graph data. Different from SSL on other domains like computer vision and natural language processing, SSL on graphs has an exclusive background, design ideas, and taxonomies. Under the umbrella of graph self-supervised learning, we present a timely and comprehensive review of the existing approaches which employ SSL techniques for graph data. We construct a unified framework that mathematically formalizes the paradigm of graph SSL. According to the objectives of pretext tasks, we divide these approaches into four categories: generation-based, auxiliary property-based, contrast-based, and hybrid approaches. We further describe the applications of graph SSL across various research fields and summarize the commonly used datasets, evaluation benchmark, performance comparison and open-source codes of graph SSL. Finally, we discuss the remaining challenges and potential future directions in this research field.

CVMar 16
Nodule-Aligned Latent Space Learning with LLM-Driven Multimodal Diffusion for Lung Nodule Progression Prediction

James Song, Yifan Wang, Chuan Zhou et al.

Early diagnosis of lung cancer is challenging due to biological uncertainty and the limited understanding of the biological mechanisms driving nodule progression. To address this, we propose Nodule-Aligned Multimodal (Latent) Diffusion (NAMD), a novel framework that predicts lung nodule progression by generating 1-year follow-up nodule computed tomography images with baseline scans and the patient's and nodule's Electronic Health Record (EHR). NAMD introduces a nodule-aligned latent space, where distances between latents directly correspond to changes in nodule attributes, and utilizes an LLM-driven control mechanism to condition the diffusion backbone on patient data. On the National Lung Screening Trial (NLST) dataset, our method synthesizes follow-up nodule images that achieve an AUROC of 0.805 and an AUPRC of 0.346 for lung nodule malignancy prediction, significantly outperforming both baseline scans and state-of-the-art synthesis methods, while closely approaching the performance of real follow-up scans (AUROC: 0.819, AUPRC: 0.393). These results demonstrate that NAMD captures clinically relevant features of lung nodule progression, facilitating earlier and more accurate diagnosis.

AISep 23, 2025
LLM-based Agents Suffer from Hallucinations: A Survey of Taxonomy, Methods, and Directions

Xixun Lin, Yucheng Ning, Jingwen Zhang et al.

Driven by the rapid advancements of Large Language Models (LLMs), LLM-based agents have emerged as powerful intelligent systems capable of human-like cognition, reasoning, and interaction. These agents are increasingly being deployed across diverse real-world applications, including student education, scientific research, and financial analysis. However, despite their remarkable potential, LLM-based agents remain vulnerable to hallucination issues, which can result in erroneous task execution and undermine the reliability of the overall system design. Addressing this critical challenge requires a deep understanding and a systematic consolidation of recent advances on LLM-based agents. To this end, we present the first comprehensive survey of hallucinations in LLM-based agents. By carefully analyzing the complete workflow of agents, we propose a new taxonomy that identifies different types of agent hallucinations occurring at different stages. Furthermore, we conduct an in-depth examination of eighteen triggering causes underlying the emergence of agent hallucinations. Through a detailed review of a large number of existing studies, we summarize approaches for hallucination mitigation and detection, and highlight promising directions for future research. We hope this survey will inspire further efforts toward addressing hallucinations in LLM-based agents, ultimately contributing to the development of more robust and reliable agent systems.

IVApr 12, 2025
Multi-Modal Brain Tumor Segmentation via 3D Multi-Scale Self-attention and Cross-attention

Yonghao Huang, Leiting Chen, Chuan Zhou

Due to the success of CNN-based and Transformer-based models in various computer vision tasks, recent works study the applicability of CNN-Transformer hybrid architecture models in 3D multi-modality medical segmentation tasks. Introducing Transformer brings long-range dependent information modeling ability in 3D medical images to hybrid models via the self-attention mechanism. However, these models usually employ fixed receptive fields of 3D volumetric features within each self-attention layer, ignoring the multi-scale volumetric lesion features. To address this issue, we propose a CNN-Transformer hybrid 3D medical image segmentation model, named TMA-TransBTS, based on an encoder-decoder structure. TMA-TransBTS realizes simultaneous extraction of multi-scale 3D features and modeling of long-distance dependencies by multi-scale division and aggregation of 3D tokens in a self-attention layer. Furthermore, TMA-TransBTS proposes a 3D multi-scale cross-attention module to establish a link between the encoder and the decoder for extracting rich volume representations by exploiting the mutual attention mechanism of cross-attention and multi-scale aggregation of 3D tokens. Extensive experimental results on three public 3D medical segmentation datasets show that TMA-TransBTS achieves higher averaged segmentation results than previous state-of-the-art CNN-based 3D methods and CNN-Transform hybrid 3D methods for the segmentation of 3D multi-modality brain tumors.

AIFeb 1
Hard Constraints Meet Soft Generation: Guaranteed Feasibility for LLM-based Combinatorial Optimization

Yang Liu, Chuan Zhou, Yancheng Chen et al.

Large language models (LLMs) have emerged as promising general-purpose solvers for combinatorial optimization (CO), yet they fundamentally lack mechanisms to guarantee solution feasibility which is critical for real-world deployment. In this work, we introduce FALCON, a framework that ensures 100\% feasibility through three key innovations: (i) \emph{grammar-constrained decoding} enforces syntactic validity, (ii) a \emph{feasibility repair layer} corrects semantic constraint violations, and (iii) \emph{adaptive Best-of-$N$ sampling} allocates inference compute efficiently. To train the underlying LLM, we introduce the Best-anchored Objective-guided Preference Optimization (BOPO) in LLM training, which weights preference pairs by their objective gap, providing dense supervision without human labels. Theoretically, we prove convergence for BOPO and provide bounds on repair-induced quality loss. Empirically, across seven NP-hard CO problems, FALCON achieves perfect feasibility while matching or exceeding the solution quality of state-of-the-art neural and LLM-based solvers.

CVNov 22, 2025
Early Lung Cancer Diagnosis from Virtual Follow-up LDCT Generation via Correlational Autoencoder and Latent Flow Matching

Yutong Wu, Yifan Wang, Qining Zhang et al.

Lung cancer is one of the most commonly diagnosed cancers, and early diagnosis is critical because the survival rate declines sharply once the disease progresses to advanced stages. However, achieving an early diagnosis remains challenging, particularly in distinguishing subtle early signals of malignancy from those of benign conditions. In clinical practice, a patient with a high risk may need to undergo an initial baseline and several annual follow-up examinations (e.g., CT scans) before receiving a definitive diagnosis, which can result in missing the optimal treatment. Recently, Artificial Intelligence (AI) methods have been increasingly used for early diagnosis of lung cancer, but most existing algorithms focus on radiomic features extraction from single early-stage CT scans. Inspired by recent advances in diffusion models for image generation, this paper proposes a generative method, named CorrFlowNet, which creates a virtual, one-year follow-up CT scan after the initial baseline scan. This virtual follow-up would allow for an early detection of malignant/benign nodules, reducing the need to wait for clinical follow-ups. During training, our approach employs a correlational autoencoder to encode both early baseline and follow-up CT images into a latent space that captures the dynamics of nodule progression as well as the correlations between them, followed by a flow matching algorithm on the latent space with a neural ordinary differential equation. An auxiliary classifier is used to further enhance the diagnostic accuracy. Evaluations on a real clinical dataset show our method can significantly improve downstream lung nodule risk assessment compared with existing baseline models. Moreover, its diagnostic accuracy is comparable with real clinical CT follow-ups, highlighting its potential to improve cancer diagnosis.

CVApr 12, 2025
Multi-modal and Multi-view Fundus Image Fusion for Retinopathy Diagnosis via Multi-scale Cross-attention and Shifted Window Self-attention

Yonghao Huang, Leiting Chen, Chuan Zhou

The joint interpretation of multi-modal and multi-view fundus images is critical for retinopathy prevention, as different views can show the complete 3D eyeball field and different modalities can provide complementary lesion areas. Compared with single images, the sequence relationships in multi-modal and multi-view fundus images contain long-range dependencies in lesion features. By modeling the long-range dependencies in these sequences, lesion areas can be more comprehensively mined, and modality-specific lesions can be detected. To learn the long-range dependency relationship and fuse complementary multi-scale lesion features between different fundus modalities, we design a multi-modal fundus image fusion method based on multi-scale cross-attention, which solves the static receptive field problem in previous multi-modal medical fusion methods based on attention. To capture multi-view relative positional relationships between different views and fuse comprehensive lesion features between different views, we design a multi-view fundus image fusion method based on shifted window self-attention, which also solves the computational complexity of the multi-view fundus fusion method based on self-attention is quadratic to the size and number of multi-view fundus images. Finally, we design a multi-task retinopathy diagnosis framework to help ophthalmologists reduce workload and improve diagnostic accuracy by combining the proposed two fusion methods. The experimental results of retinopathy classification and report generation tasks indicate our method's potential to improve the efficiency and reliability of retinopathy diagnosis in clinical practice, achieving a classification accuracy of 82.53\% and a report generation BlEU-1 of 0.543.

LGJan 15, 2025
A Partial Initialization Strategy to Mitigate the Overfitting Problem in CATE Estimation with Hidden Confounding

Chuan Zhou, Yaxuan Li, Chunyuan Zheng et al.

Estimating the conditional average treatment effect (CATE) from observational data plays a crucial role in areas such as e-commerce, healthcare, and economics. Existing studies mainly rely on the strong ignorability assumption that there are no hidden confounders, whose existence cannot be tested from observational data and can invalidate any causal conclusion. In contrast, data collected from randomized controlled trials (RCT) do not suffer from confounding but are usually limited by a small sample size. To avoid overfitting caused by the small-scale RCT data, we propose a novel two-stage pretraining-finetuning (TSPF) framework with a partial parameter initialization strategy to estimate the CATE in the presence of hidden confounding. In the first stage, a foundational representation of covariates is trained to estimate counterfactual outcomes through large-scale observational data. In the second stage, we propose to train an augmented representation of the covariates, which is concatenated with the foundational representation obtained in the first stage to adjust for the hidden confounding. Rather than training a separate network from scratch, part of the prediction heads are initialized from the first stage. The superiority of our approach is validated on two datasets with extensive experiments.

LGJun 5, 2024
Decision-focused Graph Neural Networks for Combinatorial Optimization

Yang Liu, Chuan Zhou, Peng Zhang et al.

In recent years, there has been notable interest in investigating combinatorial optimization (CO) problems by neural-based framework. An emerging strategy to tackle these challenging problems involves the adoption of graph neural networks (GNNs) as an alternative to traditional algorithms, a subject that has attracted considerable attention. Despite the growing popularity of GNNs and traditional algorithm solvers in the realm of CO, there is limited research on their integrated use and the correlation between them within an end-to-end framework. The primary focus of our work is to formulate a more efficient and precise framework for CO by employing decision-focused learning on graphs. Additionally, we introduce a decision-focused framework that utilizes GNNs to address CO problems with auxiliary support. To realize an end-to-end approach, we have designed two cascaded modules: (a) an unsupervised trained graph predictive model, and (b) a solver for quadratic binary unconstrained optimization. Empirical evaluations are conducted on various classical tasks, including maximum cut, maximum independent set, and minimum vertex cover. The experimental results on classical CO problems (i.e. MaxCut, MIS, and MVC) demonstrate the superiority of our method over both the standalone GNN approach and classical methods.

LGJun 5, 2024
Combinatorial Optimization with Automated Graph Neural Networks

Yang Liu, Peng Zhang, Yang Gao et al.

In recent years, graph neural networks (GNNs) have become increasingly popular for solving NP-hard combinatorial optimization (CO) problems, such as maximum cut and maximum independent set. The core idea behind these methods is to represent a CO problem as a graph and then use GNNs to learn the node/graph embedding with combinatorial information. Although these methods have achieved promising results, given a specific CO problem, the design of GNN architectures still requires heavy manual work with domain knowledge. Existing automated GNNs are mostly focused on traditional graph learning problems, which is inapplicable to solving NP-hard CO problems. To this end, we present a new class of \textbf{AUTO}mated \textbf{G}NNs for solving \textbf{NP}-hard problems, namely \textbf{AutoGNP}. We represent CO problems by GNNs and focus on two specific problems, i.e., mixed integer linear programming and quadratic unconstrained binary optimization. The idea of AutoGNP is to use graph neural architecture search algorithms to automatically find the best GNNs for a given NP-hard combinatorial optimization problem. Compared with existing graph neural architecture search algorithms, AutoGNP utilizes two-hop operators in the architecture search space. Moreover, AutoGNP utilizes simulated annealing and a strict early stopping policy to avoid local optimal solutions. Empirical results on benchmark combinatorial problems demonstrate the superiority of our proposed model.

LGFeb 25, 2022
Projective Ranking-based GNN Evasion Attacks

He Zhang, Xingliang Yuan, Chuan Zhou et al.

Graph neural networks (GNNs) offer promising learning methods for graph-related tasks. However, GNNs are at risk of adversarial attacks. Two primary limitations of the current evasion attack methods are highlighted: (1) The current GradArgmax ignores the "long-term" benefit of the perturbation. It is faced with zero-gradient and invalid benefit estimates in certain situations. (2) In the reinforcement learning-based attack methods, the learned attack strategies might not be transferable when the attack budget changes. To this end, we first formulate the perturbation space and propose an evaluation framework and the projective ranking method. We aim to learn a powerful attack strategy then adapt it as little as possible to generate adversarial samples under dynamic budget settings. In our method, based on mutual information, we rank and assess the attack benefits of each perturbation for an effective attack strategy. By projecting the strategy, our method dramatically minimizes the cost of learning a new attack strategy when the attack budget changes. In the comparative assessment with GradArgmax and RL-S2V, the results show our method owns high attack performance and effective transferability. The visualization of our method also reveals various attack patterns in the generation of adversarial samples.

LGJun 6, 2021
Pseudo-Riemannian Graph Convolutional Networks

Bo Xiong, Shichao Zhu, Nico Potyka et al.

Graph convolutional networks (GCNs) are powerful frameworks for learning embeddings of graph-structured data. GCNs are traditionally studied through the lens of Euclidean geometry. Recent works find that non-Euclidean Riemannian manifolds provide specific inductive biases for embedding hierarchical or spherical data. However, they cannot align well with data of mixed graph topologies. We consider a larger class of pseudo-Riemannian manifolds that generalize hyperboloid and sphere. We develop new geodesic tools that allow for extending neural network operations into geodesically disconnected pseudo-Riemannian manifolds. As a consequence, we derive a pseudo-Riemannian GCN that models data in pseudo-Riemannian manifolds of constant nonzero curvature in the context of graph neural networks. Our method provides a geometric inductive bias that is sufficiently flexible to model mixed heterogeneous topologies like hierarchical graphs with cycles. We demonstrate the representational capabilities of this method by applying it to the tasks of graph reconstruction, node classification and link prediction on a series of standard graphs with mixed topologies. Empirical results demonstrate that our method outperforms Riemannian counterparts when embedding graphs of complex topologies.

LGFeb 27, 2021
Anomaly Detection on Attributed Networks via Contrastive Self-Supervised Learning

Yixin Liu, Zhao Li, Shirui Pan et al.

Anomaly detection on attributed networks attracts considerable research interests due to wide applications of attributed networks in modeling a wide range of complex systems. Recently, the deep learning-based anomaly detection methods have shown promising results over shallow approaches, especially on networks with high-dimensional attributes and complex structures. However, existing approaches, which employ graph autoencoder as their backbone, do not fully exploit the rich information of the network, resulting in suboptimal performance. Furthermore, these methods do not directly target anomaly detection in their learning objective and fail to scale to large networks due to the full graph training mechanism. To overcome these limitations, in this paper, we present a novel contrastive self-supervised learning framework for anomaly detection on attributed networks. Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair, which can capture the relationship between each node and its neighboring substructure in an unsupervised way. Meanwhile, a well-designed graph neural network-based contrastive learning model is proposed to learn informative embedding from high-dimensional attributes and local structure and measure the agreement of each instance pairs with its outputted scores. The multi-round predicted scores by the contrastive learning model are further used to evaluate the abnormality of each node with statistical estimation. In this way, the learning model is trained by a specific anomaly detection-aware target. Furthermore, since the input of the graph neural network module is batches of instance pairs instead of the full network, our framework can adapt to large networks flexibly. Experimental results show that our proposed framework outperforms the state-of-the-art baseline methods on all seven benchmark datasets.

IRFeb 26, 2021
Task-adaptive Neural Process for User Cold-Start Recommendation

Xixun Lin, Jia Wu, Chuan Zhou et al.

User cold-start recommendation is a long-standing challenge for recommender systems due to the fact that only a few interactions of cold-start users can be exploited. Recent studies seek to address this challenge from the perspective of meta learning, and most of them follow a manner of parameter initialization, where the model parameters can be learned by a few steps of gradient updates. While these gradient-based meta-learning models achieve promising performances to some extent, a fundamental problem of them is how to adapt the global knowledge learned from previous tasks for the recommendations of cold-start users more effectively. In this paper, we develop a novel meta-learning recommender called task-adaptive neural process (TaNP). TaNP is a new member of the neural process family, where making recommendations for each user is associated with a corresponding stochastic process. TaNP directly maps the observed interactions of each user to a predictive distribution, sidestepping some training issues in gradient-based meta-learning models. More importantly, to balance the trade-off between model capacity and adaptation reliability, we introduce a novel task-adaptive mechanism. It enables our model to learn the relevance of different tasks and customize the global knowledge to the task-related decoder parameters for estimating user preferences. We validate TaNP on multiple benchmark datasets in different experimental settings. Empirical results demonstrate that TaNP yields consistent improvements over several state-of-the-art meta-learning recommenders.

LGOct 23, 2020
Graph Geometry Interaction Learning

Shichao Zhu, Shirui Pan, Chuan Zhou et al.

While numerous approaches have been developed to embed graphs into either Euclidean or hyperbolic spaces, they do not fully utilize the information available in graphs, or lack the flexibility to model intrinsic complex graph geometry. To utilize the strength of both Euclidean and hyperbolic geometries, we develop a novel Geometry Interaction Learning (GIL) method for graphs, a well-suited and efficient alternative for learning abundant geometric properties in graph. GIL captures a more informative internal structural features with low dimensions while maintaining conformal invariance of each space. Furthermore, our method endows each node the freedom to determine the importance of each geometry space via a flexible dual feature interaction learning and probability assembling mechanism. Promising experimental results are presented for five benchmark datasets on node classification and link prediction tasks.

SIMay 17, 2020
Deep Learning for Community Detection: Progress, Challenges and Opportunities

Fanzhen Liu, Shan Xue, Jia Wu et al.

As communities represent similar opinions, similar functions, similar purposes, etc., community detection is an important and extremely useful tool in both scientific inquiry and data analytics. However, the classic methods of community detection, such as spectral clustering and statistical inference, are falling by the wayside as deep learning techniques demonstrate an increasing capacity to handle high-dimensional graph data with impressive performance. Thus, a survey of current progress in community detection through deep learning is timely. Structured into three broad research streams in this domain - deep neural networks, deep graph embedding, and graph neural networks, this article summarizes the contributions of the various frameworks, models, and algorithms in each stream along with the current challenges that remain unsolved and the future research opportunities yet to be explored.

LGJun 18, 2019
Deep Active Learning for Anchor User Prediction

Anfeng Cheng, Chuan Zhou, Hong Yang et al.

Predicting pairs of anchor users plays an important role in the cross-network analysis. Due to the expensive costs of labeling anchor users for training prediction models, we consider in this paper the problem of minimizing the number of user pairs across multiple networks for labeling as to improve the accuracy of the prediction. To this end, we present a deep active learning model for anchor user prediction (DALAUP for short). However, active learning for anchor user sampling meets the challenges of non-i.i.d. user pair data caused by network structures and the correlation among anchor or non-anchor user pairs. To solve the challenges, DALAUP uses a couple of neural networks with shared-parameter to obtain the vector representations of user pairs, and ensembles three query strategies to select the most informative user pairs for labeling and model training. Experiments on real-world social network data demonstrate that DALAUP outperforms the state-of-the-art approaches.

LGApr 22, 2019
GraphNAS: Graph Neural Architecture Search with Reinforcement Learning

Yang Gao, Hong Yang, Peng Zhang et al.

Graph Neural Networks (GNNs) have been popularly used for analyzing non-Euclidean data such as social network data and biological data. Despite their success, the design of graph neural networks requires a lot of manual work and domain knowledge. In this paper, we propose a Graph Neural Architecture Search method (GraphNAS for short) that enables automatic search of the best graph neural architecture based on reinforcement learning. Specifically, GraphNAS first uses a recurrent network to generate variable-length strings that describe the architectures of graph neural networks, and then trains the recurrent network with reinforcement learning to maximize the expected accuracy of the generated architectures on a validation data set. Extensive experimental results on node classification tasks in both transductive and inductive learning settings demonstrate that GraphNAS can achieve consistently better performance on the Cora, Citeseer, Pubmed citation network, and protein-protein interaction network. On node classification tasks, GraphNAS can design a novel network architecture that rivals the best human-invented architecture in terms of test set accuracy.