Jiajun Zhou

LG
h-index14
34papers
295citations
Novelty51%
AI Score54

34 Papers

LGJun 23, 2023Code
PathMLP: Smooth Path Towards High-order Homophily

Jiajun Zhou, Chenxuan Xie, Shengbo Gong et al.

Real-world graphs exhibit increasing heterophily, where nodes no longer tend to be connected to nodes with the same label, challenging the homophily assumption of classical graph neural networks (GNNs) and impeding their performance. Intriguingly, from the observation of heterophilous data, we notice that certain high-order information exhibits higher homophily, which motivates us to involve high-order information in node representation learning. However, common practices in GNNs to acquire high-order information mainly through increasing model depth and altering message-passing mechanisms, which, albeit effective to a certain extent, suffer from three shortcomings: 1) over-smoothing due to excessive model depth and propagation times; 2) high-order information is not fully utilized; 3) low computational efficiency. In this regard, we design a similarity-based path sampling strategy to capture smooth paths containing high-order homophily. Then we propose a lightweight model based on multi-layer perceptrons (MLP), named PathMLP, which can encode messages carried by paths via simple transformation and concatenation operations, and effectively learn node representations in heterophilous graphs through adaptive path aggregation. Extensive experiments demonstrate that our method outperforms baselines on 16 out of 20 datasets, underlining its effectiveness and superiority in alleviating the heterophily problem. In addition, our method is immune to over-smoothing and has high computational efficiency. The source code will be available in https://github.com/Graph4Sec-Team/PathMLP.

LGAug 1, 2024Code
Enhancing Ethereum Fraud Detection via Generative and Contrastive Self-supervision

Chenxiang Jin, Jiajun Zhou, Chenxuan Xie et al.

The rampant fraudulent activities on Ethereum hinder the healthy development of the blockchain ecosystem, necessitating the reinforcement of regulations. However, multiple imbalances involving account interaction frequencies and interaction types in the Ethereum transaction environment pose significant challenges to data mining-based fraud detection research. To address this, we first propose the concept of meta-interactions to refine interaction behaviors in Ethereum, and based on this, we present a dual self-supervision enhanced Ethereum fraud detection framework, named Meta-IFD. This framework initially introduces a generative self-supervision mechanism to augment the interaction features of accounts, followed by a contrastive self-supervision mechanism to differentiate various behavior patterns, and ultimately characterizes the behavioral representations of accounts and mines potential fraud risks through multi-view interaction feature learning. Extensive experiments on real Ethereum datasets demonstrate the effectiveness and superiority of our framework in detecting common Ethereum fraud behaviors such as Ponzi schemes and phishing scams. Additionally, the generative module can effectively alleviate the interaction distribution imbalance in Ethereum data, while the contrastive module significantly enhances the framework's ability to distinguish different behavior patterns. The source code will be available in https://github.com/GISec-Team/Meta-IFD.

LGJun 4, 2023Code
Clarify Confused Nodes via Separated Learning

Jiajun Zhou, Shengbo Gong, Xuanze Chen et al.

Graph neural networks (GNNs) have achieved remarkable advances in graph-oriented tasks. However, real-world graphs invariably contain a certain proportion of heterophilous nodes, challenging the homophily assumption of traditional GNNs and hindering their performance. Most existing studies continue to design generic models with shared weights between heterophilous and homophilous nodes. Despite the incorporation of high-order messages or multi-channel architectures, these efforts often fall short. A minority of studies attempt to train different node groups separately but suffer from inappropriate separation metrics and low efficiency. In this paper, we first propose a new metric, termed Neighborhood Confusion (NC), to facilitate a more reliable separation of nodes. We observe that node groups with different levels of NC values exhibit certain differences in intra-group accuracy and visualized embeddings. These pave the way for Neighborhood Confusion-guided Graph Convolutional Network (NCGCN), in which nodes are grouped by their NC values and accept intra-group weight sharing and message passing. Extensive experiments on both homophilous and heterophilous benchmarks demonstrate that our framework can effectively separate nodes and yield significant performance improvement compared to the latest methods. The source code will be available in https://github.com/GISec-Team/NCGNN.

LGFeb 24, 2023
DyBit: Dynamic Bit-Precision Numbers for Efficient Quantized Neural Network Inference

Jiajun Zhou, Jiajun Wu, Yizhao Gao et al.

To accelerate the inference of deep neural networks (DNNs), quantization with low-bitwidth numbers is actively researched. A prominent challenge is to quantize the DNN models into low-bitwidth numbers without significant accuracy degradation, especially at very low bitwidths (< 8 bits). This work targets an adaptive data representation with variable-length encoding called DyBit. DyBit can dynamically adjust the precision and range of separate bit-field to be adapted to the DNN weights/activations distribution. We also propose a hardware-aware quantization framework with a mixed-precision accelerator to trade-off the inference accuracy and speedup. Experimental results demonstrate that the inference accuracy via DyBit is 1.997% higher than the state-of-the-art at 4-bit quantization, and the proposed framework can achieve up to 8.1x speedup compared with the original model.

LGAug 13, 2022
PECAN: A Product-Quantized Content Addressable Memory Network

Jie Ran, Rui Lin, Jason Chun Lok Li et al.

A novel deep neural network (DNN) architecture is proposed wherein the filtering and linear transform are realized solely with product quantization (PQ). This results in a natural implementation via content addressable memory (CAM), which transcends regular DNN layer operations and requires only simple table lookup. Two schemes are developed for the end-to-end PQ prototype training, namely, through angle- and distance-based similarities, which differ in their multiplicative and additive natures with different complexity-accuracy tradeoffs. Even more, the distance-based scheme constitutes a truly multiplier-free DNN solution. Experiments confirm the feasibility of such Product-Quantized Content Addressable Memory Network (PECAN), which has strong implication on hardware-efficient deployments especially for in-memory computing.

LGDec 20, 2022
Data Augmentation on Graphs: A Technical Survey

Jiajun Zhou, Chenxuan Xie, Shengbo Gong et al.

In recent years, graph representation learning has achieved remarkable success while suffering from low-quality data problems. As a mature technology to improve data quality in computer vision, data augmentation has also attracted increasing attention in graph domain. To advance research in this emerging direction, this survey provides a comprehensive review and summary of existing graph data augmentation (GDAug) techniques. Specifically, this survey first provides an overview of various feasible taxonomies and categorizes existing GDAug studies based on multi-scale graph elements. Subsequently, for each type of GDAug technique, this survey formalizes standardized technical definition, discuss the technical details, and provide schematic illustration. The survey also reviews domain-specific graph data augmentation techniques, including those for heterogeneous graphs, temporal graphs, spatio-temporal graphs, and hypergraphs. In addition, this survey provides a summary of available evaluation metrics and design guidelines for graph data augmentation. Lastly, it outlines the applications of GDAug at both the data and model levels, discusses open issues in the field, and looks forward to future directions. The latest advances in GDAug are summarized in GitHub.

LGOct 30, 2022
Time-aware Metapath Feature Augmentation for Ponzi Detection in Ethereum

Chengxiang Jin, Jiajun Zhou, Jie Jin et al.

With the development of Web 3.0 which emphasizes decentralization, blockchain technology ushers in its revolution and also brings numerous challenges, particularly in the field of cryptocurrency. Recently, a large number of criminal behaviors continuously emerge on blockchain, such as Ponzi schemes and phishing scams, which severely endanger decentralized finance. Existing graph-based abnormal behavior detection methods on blockchain usually focus on constructing homogeneous transaction graphs without distinguishing the heterogeneity of nodes and edges, resulting in partial loss of transaction pattern information. Although existing heterogeneous modeling methods can depict richer information through metapaths, the extracted metapaths generally neglect temporal dependencies between entities and do not reflect real behavior. In this paper, we introduce Time-aware Metapath Feature Augmentation (TMFAug) as a plug-and-play module to capture the real metapath-based transaction patterns during Ponzi scheme detection on Ethereum. The proposed module can be adaptively combined with existing graph-based Ponzi detection methods. Extensive experimental results show that our TMFAug can help existing Ponzi detection methods achieve significant performance improvements on the Ethereum dataset, indicating the effectiveness of heterogeneous temporal information for Ponzi scheme detection.

LGJan 24, 2023
Neighborhood Homophily-based Graph Convolutional Network

Shengbo Gong, Jiajun Zhou, Chenxuan Xie et al.

Graph neural networks (GNNs) have been proved powerful in graph-oriented tasks. However, many real-world graphs are heterophilous, challenging the homophily assumption of classical GNNs. To solve the universality problem, many studies deepen networks or concatenate intermediate representations, which does not inherently change neighbor aggregation and introduces noise. Recent studies propose new metrics to characterize the homophily, but rarely consider the correlation of the proposed metrics and models. In this paper, we first design a new metric, Neighborhood Homophily (\textit{NH}), to measure the label complexity or purity in node neighborhoods. Furthermore, we incorporate the metric into the classical graph convolutional network (GCN) architecture and propose \textbf{N}eighborhood \textbf{H}omophily-based \textbf{G}raph \textbf{C}onvolutional \textbf{N}etwork (\textbf{NHGCN}). In this framework, neighbors are grouped by estimated \textit{NH} values and aggregated from different channels, and the resulting node predictions are then used in turn to estimate and update \textit{NH} values. The two processes of metric estimation and model inference are alternately optimized to achieve better node classification. NHGCN achieves top overall performance on both homophilous and heterophilous benchmarks, with an improvement of up to 7.4\% compared to the current SOTA methods.

SOFTMar 2Code
PolyMon: A Unified Framework for Polymer Property Prediction

Gaopeng Ren, Yijie Yang, Jiajun Zhou et al.

Accurate prediction of polymer properties is essential for materials design, but remains challenging due to data scarcity, diverse polymer representations, and the lack of systematic evaluation across modelling choices. Here, we present PolyMon, a unified and accessible framework that integrates multiple polymer representations, machine learning methods, and training strategies within a single, accessible platform. PolyMon supports various descriptors and graph construction strategies for polymer representations, and includes a wide range of models, from tabular models to graph neural networks, along with flexible training strategies including multi-fidelity learning, Δ-learning, active learning, and ensemble learning. Using five key polymer properties as benchmarks, we perform systematic evaluations to assess how representations and models affect predictive performance. These case studies further illustrate how different training strategies can be applied within a consistent workflow to leverage limited data and incorporate physical model derived information. Overall, PolyMon provides a comprehensive and extensible foundation for benchmarking and advancing machine learning-based polymer property prediction. The code is available at github.com/fate1997/polymon.

CLMay 7, 2022
SubGraph Networks based Entity Alignment for Cross-lingual Knowledge Graph

Shanqing Yu, Shihan Zhang, Jianlin Zhang et al.

Entity alignment is the task of finding entities representing the same real-world object in two knowledge graphs(KGs). Cross-lingual knowledge graph entity alignment aims to discover the cross-lingual links in the multi-language KGs, which is of great significance to the NLP applications and multi-language KGs fusion. In the task of aligning cross-language knowledge graphs, the structures of the two graphs are very similar, and the equivalent entities often have the same subgraph structure characteristics. The traditional GCN method neglects to obtain structural features through representative parts of the original graph and the use of adjacency matrix is not enough to effectively represent the structural features of the graph. In this paper, we introduce the subgraph network (SGN) method into the GCN-based cross-lingual KG entity alignment method. In the method, we extracted the first-order subgraphs of the KGs to expand the structural features of the original graph to enhance the representation ability of the entity embedding and improve the alignment accuracy. Experiments show that the proposed method outperforms the state-of-the-art GCN-based method.

LGSep 12, 2024
Network Anomaly Traffic Detection via Multi-view Feature Fusion

Song Hao, Wentao Fu, Xuanze Chen et al.

Traditional anomalous traffic detection methods are based on single-view analysis, which has obvious limitations in dealing with complex attacks and encrypted communications. In this regard, we propose a Multi-view Feature Fusion (MuFF) method for network anomaly traffic detection. MuFF models the temporal and interactive relationships of packets in network traffic based on the temporal and interactive viewpoints respectively. It learns temporal and interactive features. These features are then fused from different perspectives for anomaly traffic detection. Extensive experiments on six real traffic datasets show that MuFF has excellent performance in network anomalous traffic detection, which makes up for the shortcomings of detection under a single perspective.

STApr 28, 2022
Cross Cryptocurrency Relationship Mining for Bitcoin Price Prediction

Panpan Li, Shengbo Gong, Shaocong Xu et al.

Blockchain finance has become a part of the world financial system, most typically manifested in the attention to the price of Bitcoin. However, a great deal of work is still limited to using technical indicators to capture Bitcoin price fluctuation, with little consideration of historical relationships and interactions between related cryptocurrencies. In this work, we propose a generic Cross-Cryptocurrency Relationship Mining module, named C2RM, which can effectively capture the synchronous and asynchronous impact factors between Bitcoin and related Altcoins. Specifically, we utilize the Dynamic Time Warping algorithm to extract the lead-lag relationship, yielding Lead-lag Variance Kernel, which will be used for aggregating the information of Altcoins to form relational impact factors. Comprehensive experimental results demonstrate that our C2RM can help existing price prediction methods achieve significant performance improvement, suggesting the effectiveness of Cross-Cryptocurrency interactions on benefitting Bitcoin price prediction.

58.0LGApr 13
Learning How Much to Think: Difficulty-Aware Dynamic MoEs for Graph Node Classification

Jiajun Zhou, Yadong Li, Xuanze Chen et al.

Mixture-of-Experts (MoE) architectures offer a scalable path for Graph Neural Networks (GNNs) in node classification tasks but typically rely on static and rigid routing strategies that enforce a uniform expert budget or coarse-grained expert toggles on all nodes. This limitation overlooks the varying discriminative difficulty of nodes and leads to under-fitting for hard nodes and redundant computation for easy ones. To resolve this issue, we propose D2MoE, a novel framework that shifts the focus from static expert selection to node-wise expert resource allocation. By using predictive entropy as a real-time proxy for difficulty, D2MoE employs a difficulty-driven top-p routing mechanism to adaptively concentrate expert resources on hard nodes while reducing overhead for easy ones, achieving continuous and fine-grained expert budget scaling for node classification. Experiments on 13 benchmarks demonstrate that D2MoE achieves consistent state-of-the-art performance, surpassing leading baselines by up to 7.92% in accuracy on heterophilous graphs. Notably, on large-scale graphs, it reduces memory consumption by up to 73.07% and training time by 46.53% compared to the best-performing Graph MoE, thereby validating its superior efficiency.

AINov 2, 2022
Discover Important Paths in the Knowledge Graph Based on Dynamic Relation Confidence

Shanqing Yu, Yijun Wu, Ran Gan et al.

Most of the existing knowledge graphs are not usually complete and can be complemented by some reasoning algorithms. The reasoning method based on path features is widely used in the field of knowledge graph reasoning and completion on account of that its have strong interpretability. However, reasoning methods based on path features still have several problems in the following aspects: Path search isinefficient, insufficient paths for sparse tasks and some paths are not helpful for reasoning tasks. In order to solve the above problems, this paper proposes a method called DC-Path that combines dynamic relation confidence and other indicators to evaluate path features, and then guide path search, finally conduct relation reasoning. Experimental result show that compared with the existing relation reasoning algorithm, this method can select the most representative features in the current reasoning task from the knowledge graph and achieve better performance on the current relation reasoning task.

IVDec 11, 2023Code
Hundred-Kilobyte Lookup Tables for Efficient Single-Image Super-Resolution

Binxiao Huang, Jason Chun Lok Li, Jie Ran et al.

Conventional super-resolution (SR) schemes make heavy use of convolutional neural networks (CNNs), which involve intensive multiply-accumulate (MAC) operations, and require specialized hardware such as graphics processing units. This contradicts the regime of edge AI that often runs on devices strained by power, computing, and storage resources. Such a challenge has motivated a series of lookup table (LUT)-based SR schemes that employ simple LUT readout and largely elude CNN computation. Nonetheless, the multi-megabyte LUTs in existing methods still prohibit on-chip storage and necessitate off-chip memory transport. This work tackles this storage hurdle and innovates hundred-kilobyte LUT (HKLUT) models amenable to on-chip cache. Utilizing an asymmetric two-branch multistage network coupled with a suite of specialized kernel patterns, HKLUT demonstrates an uncompromising performance and superior hardware efficiency over existing LUT schemes. Our implementation is publicly available at: https://github.com/jasonli0707/hklut.

LGAug 14, 2024
PolyCL: Contrastive Learning for Polymer Representation Learning via Explicit and Implicit Augmentations

Jiajun Zhou, Yijie Yang, Austin M. Mroz et al.

Polymers play a crucial role in a wide array of applications due to their diverse and tunable properties. Establishing the relationship between polymer representations and their properties is crucial to the computational design and screening of potential polymers via machine learning. The quality of the representation significantly influences the effectiveness of these computational methods. Here, we present a self-supervised contrastive learning paradigm, PolyCL, for learning high-quality polymer representation without the need for labels. Our model combines explicit and implicit augmentation strategies for improved learning performance. The results demonstrate that our model achieves either better, or highly competitive, performances on transfer learning tasks as a feature extractor without an overcomplicated training strategy or hyperparameter optimisation. Further enhancing the efficacy of our model, we conducted extensive analyses on various augmentation combinations used in contrastive learning. This led to identifying the most effective combination to maximise PolyCL's performance.

LGNov 14, 2023
Lite it fly: An All-Deformable-Butterfly Network

Rui Lin, Jason Chun Lok Li, Jiajun Zhou et al.

Most deep neural networks (DNNs) consist fundamentally of convolutional and/or fully connected layers, wherein the linear transform can be cast as the product between a filter matrix and a data matrix obtained by arranging feature tensors into columns. The lately proposed deformable butterfly (DeBut) decomposes the filter matrix into generalized, butterflylike factors, thus achieving network compression orthogonal to the traditional ways of pruning or low-rank decomposition. This work reveals an intimate link between DeBut and a systematic hierarchy of depthwise and pointwise convolutions, which explains the empirically good performance of DeBut layers. By developing an automated DeBut chain generator, we show for the first time the viability of homogenizing a DNN into all DeBut layers, thus achieving an extreme sparsity and compression. Various examples and hardware benchmarks verify the advantages of All-DeBut networks. In particular, we show it is possible to compress a PointNet to < 5% parameters with < 5% accuracy drop, a record not achievable by other compression schemes.

95.1MLMay 13
AIS: Adaptive Importance Sampling for Quantized RL

Jiajun Zhou, Wei Shao, Lingchao Zheng et al.

Reinforcement learning (RL) for large language models (LLMs) is dominated by the cost of rollout generation, which has motivated the use of low-precision rollouts (e.g., FP8) paired with a BF16 trainer to improve throughput and reduce memory pressure. This introduces a rollout-training mismatch that biases the policy gradient and can cause training to collapse outright on reasoning benchmarks. We show that the mismatch is non-stationary and acts as a double-edged sword: early in training it provides a stochastic exploration bonus, exposing the gradient to trajectories the trainer would otherwise under-sample, but the same perturbation transitions into a destabilizing source of bias as the policy concentrates. To solve this, we propose Adaptive Importance Sampling (AIS), a correction framework that adjusts the strength of its intervention on a per-batch basis. AIS combines three real-time diagnostics, namely weight reliability, divergence severity, and variance amplification, into a single mixing coefficient that interpolates between the uncorrected and fully importance-weighted gradients, suppressing the destabilizing component of the mismatch while preserving its exploratory benefit. We integrate AIS into GRPO and evaluate it on the diffusion-based LLaDA-8B-Instruct and the autoregressive Qwen3-8B and Qwen3.5-9B across mathematical reasoning and planning benchmarks. AIS matches the BF16 baseline on most tasks while retaining the 1.5 to 2.76x rollout speedup of FP8.

48.8LGMar 29
CrossHGL: A Text-Free Foundation Model for Cross-Domain Heterogeneous Graph Learning

Xuanze Chen, Jiajun Zhou, Yadong Li et al.

Heterogeneous graph representation learning (HGRL) is essential for modeling complex systems with diverse node and edge types. However, most existing methods are limited to closed-world settings with shared schemas and feature spaces, hindering cross-domain generalization. While recent graph foundation models improve transferability, they often target homogeneous graphs, rely on domain-specific schemas, or require rich textual attributes. Consequently, text-free and few-shot cross-domain HGRL remains underexplored. To address this, we propose CrossHGL, a foundation framework that preserves and transfers multi-relational structural semantics without external textual supervision. Specifically, a semantic-preserving transformation strategy homogenizes heterogeneous graphs while encoding interaction semantics into edge features. Based on this, a prompt-aware multi-domain pre-training framework with a Tri-Prompt mechanism captures transferable knowledge across feature, edge, and structure perspectives via self-supervised contrastive learning. For target-domain adaptation, we develop a parameter-efficient fine-tuning strategy that freezes the pre-trained backbone and performs few-shot classification via prompt composition and prototypical learning. Experiments on node-level and graph-level tasks show that CrossHGL consistently outperforms state-of-the-art baselines, yielding average relative improvements of 25.1% and 7.6% in Micro-F1 for node and graph classification, respectively, while remaining competitive in challenging feature-degenerated settings.

CLFeb 18, 2024
LoRETTA: Low-Rank Economic Tensor-Train Adaptation for Ultra-Low-Parameter Fine-Tuning of Large Language Models

Yifan Yang, Jiajun Zhou, Ngai Wong et al.

Various parameter-efficient fine-tuning (PEFT) techniques have been proposed to enable computationally efficient fine-tuning while maintaining model performance. However, existing PEFT methods are still limited by the growing number of trainable parameters with the rapid deployment of Large Language Models (LLMs). To address this challenge, we present LoRETTA, an ultra-parameter-efficient framework that significantly reduces trainable parameters through tensor-train decomposition. Specifically, we propose two methods, named {LoRETTA}$_{adp}$ and {LoRETTA}$_{rep}$. The former employs tensorized adapters, offering a high-performance yet lightweight approach for the fine-tuning of LLMs. The latter emphasizes fine-tuning via weight parameterization with a set of small tensor factors. LoRETTA achieves comparable or better performance than most widely used PEFT methods with up to $100\times$ fewer parameters on the LLaMA-2-7B models. Furthermore, empirical results demonstrate that the proposed method effectively improves training efficiency, enjoys better multi-task learning performance, and enhances the anti-overfitting capability. Plug-and-play codes built upon the Huggingface framework and PEFT library will be released.

LGFeb 17, 2025
QuZO: Quantized Zeroth-Order Fine-Tuning for Large Language Models

Jiajun Zhou, Yifan Yang, Kai Zhen et al.

Language Models (LLMs) are often quantized to lower precision to reduce the memory cost and latency in inference. However, quantization often degrades model performance, thus fine-tuning is required for various down-stream tasks. Traditional fine-tuning methods such as stochastic gradient descent and Adam optimization require backpropagation, which are error-prone in the low-precision settings. To overcome these limitations, we propose the Quantized Zeroth-Order (QuZO) framework, specifically designed for fine-tuning LLMs through low-precision (e.g., 4- or 8-bit) forward passes. Our method can avoid the error-prone low-precision straight-through estimator, and utilizes optimized stochastic rounding to mitigate the increased bias. QuZO simplifies the training process, while achieving results comparable to first-order methods in ${\rm FP}8$ and superior accuracy in ${\rm INT}8$ and ${\rm INT}4$ training. Experiments demonstrate that low-bit training QuZO achieves performance comparable to MeZO optimization on GLUE, Multi-Choice, and Generation tasks, while reducing memory cost by $2.94 \times$ in LLaMA2-7B fine-tuning compared to quantized first-order methods.

LGFeb 12, 2025
Mixture of Message Passing Experts with Routing Entropy Regularization for Node Classification

Xuanze Chen, Jiajun Zhou, Yadong Li et al.

Graph neural networks (GNNs) have achieved significant progress in graph-based learning tasks, yet their performance often deteriorates when facing heterophilous structures where connected nodes differ substantially in features and labels. To address this limitation, we propose GNNMoE, a novel entropy-driven mixture of message-passing experts framework that enables node-level adaptive representation learning. GNNMoE decomposes message passing into propagation and transformation operations and integrates them through multiple expert networks guided by a hybrid routing mechanism. And a routing entropy regularization dynamically adjusts soft weighting and soft top-$k$ routing, allowing GNNMoE to flexibly adapt to diverse neighborhood contexts. Extensive experiments on twelve benchmark datasets demonstrate that GNNMoE consistently outperforms SOTA node classification methods, while maintaining scalability and interpretability. This work provides a unified and principled approach for achieving fine-grained, personalized node representation learning.

CRMar 24, 2024
EG-ConMix: An Intrusion Detection Method based on Graph Contrastive Learning

Lijin Wu, Shanshan Lei, Feilong Liao et al.

As the number of IoT devices increases, security concerns become more prominent. The impact of threats can be minimized by deploying Network Intrusion Detection System (NIDS) by monitoring network traffic, detecting and discovering intrusions, and issuing security alerts promptly. Most intrusion detection research in recent years has been directed towards the pair of traffic itself without considering the interrelationships among them, thus limiting the monitoring of complex IoT network attack events. Besides, anomalous traffic in real networks accounts for only a small fraction, which leads to a severe imbalance problem in the dataset that makes algorithmic learning and prediction extremely difficult. In this paper, we propose an EG-ConMix method based on E-GraphSAGE, incorporating a data augmentation module to fix the problem of data imbalance. In addition, we incorporate contrastive learning to discern the difference between normal and malicious traffic samples, facilitating the extraction of key features. Extensive experiments on two publicly available datasets demonstrate the superior intrusion detection performance of EG-ConMix compared to state-of-the-art methods. Remarkably, it exhibits significant advantages in terms of training speed and accuracy for large-scale graphs.

CRMar 27, 2025
Unveiling Latent Information in Transaction Hashes: Hypergraph Learning for Ethereum Ponzi Scheme Detection

Junhao Wu, Yixin Yang, Chengxiang Jin et al.

With the widespread adoption of Ethereum, financial frauds such as Ponzi schemes have become increasingly rampant in the blockchain ecosystem, posing significant threats to the security of account assets. Existing Ethereum fraud detection methods typically model account transactions as graphs, but this approach primarily focuses on binary transactional relationships between accounts, failing to adequately capture the complex multi-party interaction patterns inherent in Ethereum. To address this, we propose a hypergraph modeling method for the Ponzi scheme detection method in Ethereum, called HyperDet. Specifically, we treat transaction hashes as hyperedges that connect all the relevant accounts involved in a transaction. Additionally, we design a two-step hypergraph sampling strategy to significantly reduce computational complexity. Furthermore, we introduce a dual-channel detection module, including the hypergraph detection channel and the hyper-homo graph detection channel, to be compatible with existing detection methods. Experimental results show that, compared to traditional homogeneous graph-based methods, the hyper-homo graph detection channel achieves significant performance improvements, demonstrating the superiority of hypergraph in Ponzi scheme detection. This research offers innovations for modeling complex relationships in blockchain data.

CRNov 15, 2024
Lateral Movement Detection via Time-aware Subgraph Classification on Authentication Logs

Jiajun Zhou, Jiacheng Yao, Xuanze Chen et al.

Lateral movement is a crucial component of advanced persistent threat (APT) attacks in networks. Attackers exploit security vulnerabilities in internal networks or IoT devices, expanding their control after initial infiltration to steal sensitive data or carry out other malicious activities, posing a serious threat to system security. Existing research suggests that attackers generally employ seemingly unrelated operations to mask their malicious intentions, thereby evading existing lateral movement detection methods and hiding their intrusion traces. In this regard, we analyze host authentication log data from a graph perspective and propose a multi-scale lateral movement detection framework called LMDetect. The main workflow of this framework proceeds as follows: 1) Construct a heterogeneous multigraph from host authentication log data to strengthen the correlations among internal system entities; 2) Design a time-aware subgraph generator to extract subgraphs centered on authentication events from the heterogeneous authentication multigraph; 3) Design a multi-scale attention encoder that leverages both local and global attention to capture hidden anomalous behavior patterns in the authentication subgraphs, thereby achieving lateral movement detection. Extensive experiments on two real-world authentication log datasets demonstrate the effectiveness and superiority of our framework in detecting lateral movement behaviors.

LGOct 15, 2024
Rethinking Graph Transformer Architecture Design for Node Classification

Jiajun Zhou, Xuanze Chen, Chenxuan Xie et al.

Graph Transformer (GT), as a special type of Graph Neural Networks (GNNs), utilizes multi-head attention to facilitate high-order message passing. However, this also imposes several limitations in node classification applications: 1) nodes are susceptible to global noise; 2) self-attention computation cannot scale well to large graphs. In this work, we conduct extensive observational experiments to explore the adaptability of the GT architecture in node classification tasks and draw several conclusions: the current multi-head self-attention module in GT can be completely replaceable, while the feed-forward neural network module proves to be valuable. Based on this, we decouple the propagation (P) and transformation (T) of GNNs and explore a powerful GT architecture, named GNNFormer, which is based on the P/T combination message passing and adapted for node classification in both homophilous and heterophilous scenarios. Extensive experiments on 12 benchmark datasets demonstrate that our proposed GT architecture can effectively adapt to node classification tasks without being affected by global noise and computational efficiency limitations.

CRApr 1, 2025
Hierarchical Local-Global Feature Learning for Few-shot Malicious Traffic Detection

Songtao Peng, Lei Wang, Wu Shuai et al.

With the rapid growth of internet traffic, malicious network attacks have become increasingly frequent and sophisticated, posing significant threats to global cybersecurity. Traditional detection methods, including rule-based and machine learning-based approaches, struggle to accurately identify emerging threats, particularly in scenarios with limited samples. While recent advances in few-shot learning have partially addressed the data scarcity issue, existing methods still exhibit high false positive rates and lack the capability to effectively capture crucial local traffic patterns. In this paper, we propose HLoG, a novel hierarchical few-shot malicious traffic detection framework that leverages both local and global features extracted from network sessions. HLoG employs a sliding-window approach to segment sessions into phases, capturing fine-grained local interaction patterns through hierarchical bidirectional GRU encoding, while simultaneously modeling global contextual dependencies. We further design a session similarity assessment module that integrates local similarity with global self-attention-enhanced representations, achieving accurate and robust few-shot traffic classification. Comprehensive experiments on three meticulously reconstructed datasets demonstrate that HLoG significantly outperforms existing state-of-the-art methods. Particularly, HLoG achieves superior recall rates while substantially reducing false positives, highlighting its effectiveness and practical value in real-world cybersecurity applications.

LGDec 11, 2024
Mixture of Experts Meets Decoupled Message Passing: Towards General and Adaptive Node Classification

Xuanze Chen, Jiajun Zhou, Shanqing Yu et al.

Graph neural networks excel at graph representation learning but struggle with heterophilous data and long-range dependencies. And graph transformers address these issues through self-attention, yet face scalability and noise challenges on large-scale graphs. To overcome these limitations, we propose GNNMoE, a universal model architecture for node classification. This architecture flexibly combines fine-grained message-passing operations with a mixture-of-experts mechanism to build feature encoding blocks. Furthermore, by incorporating soft and hard gating layers to assign the most suitable expert networks to each node, we enhance the model's expressive power and adaptability to different graph types. In addition, we introduce adaptive residual connections and an enhanced FFN module into GNNMoE, further improving the expressiveness of node representation. Extensive experimental results demonstrate that GNNMoE performs exceptionally well across various types of graph data, effectively alleviating the over-smoothing issue and global noise, enhancing model robustness and adaptability, while also ensuring computational efficiency on large-scale graphs.

CRJun 29, 2024
Dual-view Aware Smart Contract Vulnerability Detection for Ethereum

Jiacheng Yao, Maolin Wang, Wanqi Chen et al.

The wide application of Ethereum technology has brought technological innovation to traditional industries. As one of Ethereum's core applications, smart contracts utilize diverse contract codes to meet various functional needs and have gained widespread use. However, the non-tamperability of smart contracts, coupled with vulnerabilities caused by natural flaws or human errors, has brought unprecedented challenges to blockchain security. Therefore, in order to ensure the healthy development of blockchain technology and the stability of the blockchain community, it is particularly important to study the vulnerability detection techniques for smart contracts. In this paper, we propose a Dual-view Aware Smart Contract Vulnerability Detection Framework named DVDet. The framework initially converts the source code and bytecode of smart contracts into weighted graphs and control flow sequences, capturing potential risk features from these two perspectives and integrating them for analysis, ultimately achieving effective contract vulnerability detection. Comprehensive experiments on the Ethereum dataset show that our method outperforms others in detecting vulnerabilities.

LGMar 24, 2024
A Federated Parameter Aggregation Method for Node Classification Tasks with Different Graph Network Structures

Hao Song, Jiacheng Yao, Zhengxi Li et al.

Over the past few years, federated learning has become widely used in various classical machine learning fields because of its collaborative ability to train data from multiple sources without compromising privacy. However, in the area of graph neural networks, the nodes and network structures of graphs held by clients are different in many practical applications, and the aggregation method that directly shares model gradients cannot be directly applied to this scenario. Therefore, this work proposes a federated aggregation method FLGNN applied to various graph federation scenarios and investigates the aggregation effect of parameter sharing at each layer of the graph neural network model. The effectiveness of the federated aggregation method FLGNN is verified by experiments on real datasets. Additionally, for the privacy security of FLGNN, this paper designs membership inference attack experiments and differential privacy defense experiments. The results show that FLGNN performs good robustness, and the success rate of privacy theft is further reduced by adding differential privacy defense methods.

CVDec 15, 2023
A Unifying Tensor View for Lightweight CNNs

Jason Chun Lok Li, Rui Lin, Jiajun Zhou et al.

Despite the decomposition of convolutional kernels for lightweight CNNs being well studied, existing works that rely on tensor network diagrams or hyperdimensional abstraction lack geometry intuition. This work devises a new perspective by linking a 3D-reshaped kernel tensor to its various slice-wise and rank-1 decompositions, permitting a straightforward connection between various tensor approximations and efficient CNN modules. Specifically, it is discovered that a pointwise-depthwise-pointwise (PDP) configuration constitutes a viable construct for lightweight CNNs. Moreover, a novel link to the latest ShiftNet is established, inspiring a first-ever shift layer pruning that achieves nearly 50% compression with < 1% drop in accuracy for ShiftResNet.

CRDec 7, 2021
BlockGC: A Joint Learning Framework for Account Identity Inference on Blockchain with Graph Contrast

Jiajun Zhou, Chenkai Hu, Shenbo Gong et al.

Blockchain technology has the characteristics of decentralization, traceability and tamper proof, which creates a reliable decentralized transaction mode, further accelerating the development of the blockchain platforms. However, with the popularization of various financial applications, security problems caused by blockchain digital assets, such as money laundering, illegal fundraising and phishing fraud, are constantly on the rise. Therefore, financial security has become an important issue in the blockchain ecosystem, and identifying the types of accounts in blockchain (e.g. miners, phishing accounts, Ponzi contracts, etc.) is of great significance in risk assessment and market supervision. In this paper, we construct an account interaction graph using raw blockchain data in a graph perspective, and proposes a joint learning framework for account identity inference on blockchain with graph contrast. We first capture transaction feature and correlation feature from interaction graph, and then perform sampling and data augmentation to generate multiple views for account subgraphs, finally jointly train the subgraph contrast and account classification task. Extensive experiments on Ethereum datasets show that our method achieves significant advantages in account identity inference task in terms of classification performance, scalability and generalization.

SIApr 14, 2021
Identity Inference on Blockchain using Graph Neural Network

Jie Shen, Jiajun Zhou, Yunyi Xie et al.

The anonymity of blockchain has accelerated the growth of illegal activities and criminal behaviors on cryptocurrency platforms. Although decentralization is one of the typical characteristics of blockchain, we urgently call for effective regulation to detect these illegal behaviors to ensure the safety and stability of user transactions. Identity inference, which aims to make a preliminary inference about account identity, plays a significant role in blockchain security. As a common tool, graph mining technique can effectively represent the interactive information between accounts and be used for identity inference. However, existing methods cannot balance scalability and end-to-end architecture, resulting high computational consumption and weak feature representation. In this paper, we present a novel approach to analyze user's behavior from the perspective of the transaction subgraph, which naturally transforms the identity inference task into a graph classification pattern and effectively avoids computation in large-scale graph. Furthermore, we propose a generic end-to-end graph neural network model, named $\text{I}^2 \text{BGNN}$, which can accept subgraph as input and learn a function mapping the transaction subgraph pattern to account identity, achieving de-anonymization. Extensive experiments on EOSG and ETHG datasets demonstrate that the proposed method achieve the state-of-the-art performance in identity inference.

LGJul 11, 2020
M-Evolve: Structural-Mapping-Based Data Augmentation for Graph Classification

Jiajun Zhou, Jie Shen, Shanqing Yu et al.

Graph classification, which aims to identify the category labels of graphs, plays a significant role in drug classification, toxicity detection, protein analysis etc. However, the limitation of scale in the benchmark datasets makes it easy for graph classification models to fall into over-fitting and undergeneralization. To improve this, we introduce data augmentation on graphs (i.e. graph augmentation) and present four methods:random mapping, vertex-similarity mapping, motif-random mapping and motif-similarity mapping, to generate more weakly labeled data for small-scale benchmark datasets via heuristic transformation of graph structures. Furthermore, we propose a generic model evolution framework, named M-Evolve, which combines graph augmentation, data filtration and model retraining to optimize pre-trained graph classifiers. Experiments on six benchmark datasets demonstrate that the proposed framework helps existing graph classification models alleviate over-fitting and undergeneralization in the training on small-scale benchmark datasets, which successfully yields an average improvement of 3 - 13% accuracy on graph classification tasks.