Zhichao Han

LG
h-index: 23
Papers: 21
Citations: 3,966
Novelty: 50%
AI Score: 53

21 Papers

LG · Aug 31, 2023 · Code
BenchTemp: A General Benchmark for Evaluating Temporal Graph Neural Networks

Qiang Huang, Jiawei Jiang, Xi Susie Rao et al. · eth-zurich

To handle graphs in which features or connectivities evolve over time, a series of temporal graph neural networks (TGNNs) have been proposed. Despite the success of these TGNNs, previous TGNN evaluations show limitations in four critical areas: 1) inconsistent datasets, 2) inconsistent evaluation pipelines, 3) a lack of workload diversity, and 4) a lack of efficiency comparisons. Overall, an empirical study that puts TGNN models on the same ground and compares them comprehensively is still lacking. To this end, we propose BenchTemp, a general benchmark for evaluating TGNN models on various workloads. BenchTemp provides a set of benchmark datasets so that different TGNN models can be fairly compared. Further, BenchTemp engineers a standard pipeline that unifies the TGNN evaluation. With BenchTemp, we extensively compare the representative TGNN models on different tasks (e.g., link prediction and node classification) and settings (transductive and inductive), w.r.t. both effectiveness and efficiency metrics. We have made BenchTemp publicly available at https://github.com/qianghuangwhu/benchtemp.

LG · May 25, 2022
BRIGHT -- Graph Neural Networks in Real-Time Fraud Detection

Mingxuan Lu, Zhichao Han, Susie Xi Rao et al. · eth-zurich

Detecting fraudulent transactions is an essential component of risk control in e-commerce marketplaces. Apart from rule-based and machine learning filters that are already deployed in production, we want to enable efficient real-time inference with graph neural networks (GNNs), which is useful for catching multihop risk propagation in a transaction graph. However, two challenges arise in the implementation of GNNs in production. First, future information in a dynamic graph should not be considered in message passing to predict the past. Second, the latency of graph query and GNN model inference is usually up to hundreds of milliseconds, which is costly for some critical online services. To tackle these challenges, we propose a Batch and Real-time Inception GrapH Topology (BRIGHT) framework to conduct end-to-end GNN learning that allows efficient online real-time inference. The BRIGHT framework consists of a graph transformation module (Two-Stage Directed Graph) and a corresponding GNN architecture (Lambda Neural Network). The Two-Stage Directed Graph guarantees that the information passed through neighbors comes only from historical payment transactions. It consists of two subgraphs representing historical relationships and real-time links, respectively. The Lambda Neural Network decouples inference into two stages: batch inference of entity embeddings and real-time inference of transaction prediction. Our experiments show that BRIGHT outperforms the baseline models by more than 2% on average in precision. Furthermore, BRIGHT is computationally efficient for real-time fraud detection. Regarding end-to-end performance (including neighbor query and inference), BRIGHT can reduce the P99 latency by more than 75%. For the inference stage, our speedup is on average 7.8× compared to the traditional GNN.
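The first constraint in the abstract above (no future information in message passing) can be illustrated with a minimal sketch. This is not the Two-Stage Directed Graph itself; the function name `historical_neighbors`, the `(src, dst, timestamp)` edge layout, and the sample data are all assumptions made for illustration.

```python
def historical_neighbors(edges, node, query_time):
    """Return neighbors of `node` reached by edges strictly older than `query_time`.

    `edges` is a list of (src, dst, timestamp) tuples (a hypothetical layout).
    """
    result = []
    for src, dst, ts in edges:
        if ts >= query_time:
            # Skip edges created at or after the transaction being scored.
            continue
        if src == node:
            result.append(dst)
        elif dst == node:
            result.append(src)
    return result


# Hypothetical transaction graph: three buyers used the same card at times 10, 20, 30.
edges = [
    ("buyer_1", "card_9", 10),
    ("buyer_2", "card_9", 20),
    ("buyer_3", "card_9", 30),
]

# Scoring a transaction at time 25 may only aggregate from the first two links.
visible = historical_neighbors(edges, "card_9", 25)
```

Filtering by timestamp at query time is the simplest way to state the constraint; the abstract describes enforcing it structurally through the graph construction instead.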

66.0 · LG · May 29
Learning Permutation-invariant Macroscopic Dynamics

Zhichao Han, Mengyi Chen, Qianxiao Li

Accurately modeling the macroscopic dynamics of high-dimensional microscopic systems is of broad interest across the sciences. Many data-driven approaches learn a low-dimensional latent state through an autoencoder trained for pointwise input reconstruction. These methods typically assume a fixed ordering of microscopic degrees of freedom in the input. However, in many settings, such as particle systems, the microscopic state is inherently unordered. This motivates an autoencoder framework that learns permutation-invariant latent representations. To this end, we adopt a permutation-invariant encoder and design the decoder to reconstruct the mass distribution centered at the observed points rather than per-sample reconstruction. We then jointly learn the macroscopic dynamics of the observables together with the latent states. We demonstrate the effectiveness and robustness of the proposed method across a range of microscopic settings, including learning the energy dynamics in interacting particle systems, predicting mixing dynamics in Lennard-Jones fluids, and modeling the stretching dynamics from video data of polymers moving in an elongational force field.
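As a rough illustration of what a permutation-invariant encoder means here, the sketch below applies the same feature map to every particle and then mean-pools, so the latent vector cannot depend on particle order. The feature map, the weights, and the name `encode` are assumptions for illustration and are not taken from the paper.

```python
import math
import random


def encode(points, weights):
    """Apply a shared per-point feature map, then mean-pool over the set."""

    def phi(point):
        # Hypothetical feature map: tanh of a linear projection of one particle.
        return [math.tanh(sum(w * c for w, c in zip(row, point))) for row in weights]

    feats = [phi(p) for p in points]
    n = len(feats)
    # math.fsum gives a correctly rounded sum, so the result does not depend on order.
    return [math.fsum(f[k] for f in feats) / n for k in range(len(weights))]


# Four 2-D particles and a 3x2 projection (arbitrary illustrative values).
weights = [[0.5, -1.0], [1.5, 0.25], [-0.75, 2.0]]
points = [[0.1, 0.2], [1.0, -0.5], [-0.3, 0.8], [2.0, 2.0]]

shuffled = points[:]
random.Random(0).shuffle(shuffled)

z = encode(points, weights)
z_shuffled = encode(shuffled, weights)  # same latent vector as z
```

Any symmetric pooling (sum, mean, max) gives the same invariance; mean pooling is used here only because it keeps the latent scale independent of the particle count.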

LG · Apr 22, 2022
Modelling graph dynamics in fraud detection with "Attention"

Susie Xi Rao, Clémence Lanfranchi, Shuai Zhang et al. · amazon-science, eth-zurich

On online retail platforms, detecting fraudulent accounts and transactions is crucial to improve customer experience, minimize loss, and avoid unauthorized transactions. Despite the variety of models for deep learning on graphs, few approaches have been proposed for dealing with graphs that are both heterogeneous and dynamic. In this paper, we propose DyHGN (Dynamic Heterogeneous Graph Neural Network) and its variants to capture both temporal and heterogeneous information. We first construct dynamic heterogeneous graphs from registration and transaction data from eBay. Then, we build models with diachronic entity embedding and heterogeneous graph transformer. We also use model explainability techniques to understand the behaviors of DyHGN-* models. Our findings reveal that modelling graph dynamics with heterogeneous inputs needs to be conducted with "attention" depending on the data structure, distribution, and computation cost.

LG · Sep 9, 2024 · Code
Retrofitting Temporal Graph Neural Networks with Transformer

Qiang Huang, Xiao Yan, Xin Wang et al.

Temporal graph neural networks (TGNNs) outperform regular GNNs by incorporating time information into graph-based operations. However, TGNNs adopt specialized models (e.g., TGN, TGAT, and APAN) and require tailored training frameworks (e.g., TGL and ETC). In this paper, we propose TF-TGN, which uses a Transformer decoder as the backbone model for TGNN in order to reuse the Transformer codebase for efficient training. In particular, the Transformer has achieved tremendous success in language modeling, and the community has therefore developed high-performance kernels (e.g., flash-attention and memory-efficient attention) and efficient distributed training schemes (e.g., PyTorch FSDP, DeepSpeed, and Megatron-LM). We observe that TGNN resembles language modeling, i.e., the message aggregation operation between chronologically occurring nodes and their temporal neighbors in TGNNs can be structured as sequence modeling. Besides this similarity, we also incorporate a series of algorithm designs including suffix infilling, temporal graph attention with self-loop, and causal masking self-attention to make TF-TGN work. During training, existing systems are slow in transforming the graph topology and conducting graph sampling. As such, we propose methods to parallelize the CSR format conversion and graph sampling. We also adapt the Transformer codebase to train TF-TGN efficiently with multiple GPUs. We experiment with 9 graphs and compare with 2 state-of-the-art TGNN training frameworks. The results show that TF-TGN can accelerate training by over 2.20× while providing comparable or even superior accuracy to existing SOTA TGNNs. TF-TGN is available at https://github.com/qianghuangwhu/TF-TGN.
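The causal masking mentioned in the abstract can be sketched in a few lines: once temporal neighbors are arranged chronologically as a sequence, position i is allowed to attend only to positions j <= i. The code below is a generic illustration in plain Python, not TF-TGN's implementation; the function name `causal_attention` and the score values are assumptions.

```python
import math


def causal_attention(scores):
    """Row-wise softmax over scores[i][j], keeping only positions j <= i."""
    weights = []
    for i, row in enumerate(scores):
        visible = row[: i + 1]            # later interactions are masked out
        peak = max(visible)               # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in visible]
        total = sum(exps)
        weights.append([e / total for e in exps] + [0.0] * (len(row) - i - 1))
    return weights


# Hypothetical raw attention scores for three chronologically ordered interactions.
scores = [
    [0.2, 1.5, -0.3],
    [0.7, 0.1, 2.0],
    [1.0, 1.0, 1.0],
]
attn = causal_attention(scores)
# The earliest interaction can only attend to itself, so attn[0] is [1.0, 0.0, 0.0].
```

Production kernels apply the same mask inside a fused attention kernel rather than slicing rows, but the visible set per position is the same.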

LG · Apr 30, 2023
Collective Relational Inference for learning heterogeneous interactions

Zhichao Han, Olga Fink, David S. Kammer

Interacting systems are ubiquitous in nature and engineering, ranging from particle dynamics in physics to functionally connected brain regions. These interacting systems can be modeled by graphs where edges correspond to the interactions between interactive entities. Revealing interaction laws is of fundamental importance but also particularly challenging due to underlying configurational complexities. The associated challenges become exacerbated for heterogeneous systems that are prevalent in reality, where multiple interaction types coexist simultaneously and relational inference is required. Here, we propose a novel probabilistic method for relational inference, which possesses two distinctive characteristics compared to existing methods. First, it infers the interaction types of different edges collectively by explicitly encoding the correlation among incoming interactions with a joint distribution, and second, it allows handling systems with variable topological structure over time. We evaluate the proposed methodology across several benchmark datasets and demonstrate that it outperforms existing methods in accurately inferring interaction types. We further show that when combined with known constraints, it allows us, for example, to discover physics-consistent interaction laws of particle systems. Overall the proposed model is data-efficient and generalizable to large systems when trained on smaller ones. The developed methodology constitutes a key element for understanding interacting systems and may find application in graph structure learning.

MTRL-SCI · Jul 25, 2024
Learning Physics-Consistent Material Behavior from Dynamic Displacements

Zhichao Han, Mohit Pundir, Olga Fink et al.

Accurately modeling the mechanical behavior of materials is crucial for numerous engineering applications. The quality of these models depends directly on the accuracy of the constitutive law that defines the stress-strain relation. However, discovering these constitutive material laws remains a significant challenge, in particular when only material deformation data is available. To address this challenge, unsupervised machine learning methods have been proposed to learn the constitutive law from deformation data. Nonetheless, existing approaches have several limitations: they either fail to ensure that the learned constitutive relations are consistent with physical principles, or they rely on boundary force data for training which are unavailable in many in-situ scenarios. Here, we introduce a machine learning approach to learn physics-consistent constitutive relations solely from material deformation without boundary force information. This is achieved by considering a dynamic formulation rather than static equilibrium data and applying an input convex neural network (ICNN). We validate the effectiveness of the proposed method on a diverse range of hyperelastic material laws. We demonstrate that it is robust to a significant level of noise and that it converges to the ground truth with increasing data resolution. We also show that the model can be effectively trained using a displacement field from a subdomain of the test specimen and that the learned constitutive relation from one material sample is transferable to other samples with different geometries. The developed methodology provides an effective tool for discovering constitutive relations. It is, due to its design based on dynamics, particularly suited for applications to strain-rate-dependent materials and situations where constitutive laws need to be inferred from in-situ measurements without access to global force data.

LG · Apr 2, 2025 · Code
MLKV: Efficiently Scaling up Large Embedding Model Training with Disk-based Key-Value Storage

Yongjun He, Roger Waleffe, Zhichao Han et al.

Many modern machine learning (ML) methods rely on embedding models to learn vector representations (embeddings) for a set of entities (embedding tables). As increasingly diverse ML applications utilize embedding models and embedding tables continue to grow in size and number, there has been a surge in the ad-hoc development of specialized frameworks targeted to train large embedding models for specific tasks. Although the scalability issues that arise in different embedding model training tasks are similar, each of these frameworks independently reinvents and customizes storage components for specific tasks, leading to substantial duplicated engineering efforts in both development and deployment. This paper presents MLKV, an efficient, extensible, and reusable data storage framework designed to address the scalability challenges in embedding model training, specifically data stall and staleness. MLKV augments disk-based key-value storage by democratizing optimizations that were previously exclusive to individual specialized frameworks and provides easy-to-use interfaces for embedding model training tasks. Extensive experiments on open-source workloads, as well as applications in eBay's payment transaction risk detection and seller payment risk detection, show that MLKV outperforms offloading strategies built on top of industrial-strength key-value stores by 1.6-12.6x. MLKV is open-source at https://github.com/llm-db/MLKV.

RO · May 21, 2021 · Code
Fast-Racing: An Open-source Strong Baseline for SE(3) Planning in Autonomous Drone Racing

Zhichao Han, Zhepei Wang, Neng Pan et al.

As the autonomy of aerial robots has advanced in recent years, autonomous drone racing has drawn increasing attention. In a professional pilot competition, a skilled operator controls the drone to agilely avoid obstacles in aggressive attitudes, reaching the destination as fast as possible. Flying autonomously like elite pilots requires planning in SE(3), whose non-triviality and complexity have so far hindered a convincing solution in our community. To bridge this gap, this paper proposes an open-source baseline, which includes a high-performance SE(3) planner and a challenging simulation platform tailored for drone racing. We specify the SE(3) trajectory generation as a soft-penalty optimization problem, and speed up the solving process by utilizing its underlying parallel structure. Moreover, to provide a testbed for challenging the planner, we develop delicate drone racing tracks that mimic real-world set-ups and necessitate planning in SE(3). Besides, we provide necessary system components such as common map interfaces and a baseline controller, to make our work plug-in-and-use. With our baseline, we hope to further foster the research of SE(3) planning and the competition of autonomous drone racing.

73.1 · RO · Apr 7
Precise Aggressive Aerial Maneuvers with Sensorimotor Policies

Tianyue Wu, Guangtong Xu, Zihan Wang et al.

Precise aggressive maneuvers with lightweight onboard sensors remain a key bottleneck in fully exploiting the maneuverability of drones. Such maneuvers are critical for expanding the systems' accessible area by navigating through narrow openings in the environment. Among the most relevant problems, a representative one is aggressive traversal through narrow gaps with quadrotors under SE(3) constraints, which requires the quadrotors to leverage a momentary tilted attitude and the asymmetry of the airframe to navigate through gaps. In this paper, we achieve such maneuvers by developing sensorimotor policies directly mapping onboard vision and proprioception into low-level control commands. The policies are trained using reinforcement learning (RL) with end-to-end policy distillation in simulation. We mitigate the fundamental hardness of model-free RL's exploration on the restricted solution space with an initialization strategy leveraging trajectories generated by a model-based planner. Careful sim-to-real design allows the policy to control a quadrotor through narrow gaps with low clearances and high repeatability. For instance, the proposed method enables a quadrotor to navigate a rectangular gap at a 5 cm clearance, tilted at up to 90-degree orientation, without knowledge of the gap's position or orientation. Without training on dynamic gaps, the policy can reactively servo the quadrotor to traverse through a moving gap. The proposed method is also validated by training and deploying policies on challenging tracks of narrow gaps placed closely. The flexibility of the policy learning method is demonstrated by developing policies for geometrically diverse gaps, without relying on manually defined traversal poses and visual features.

RO · Dec 17, 2025
VLA-AN: An Efficient and Onboard Vision-Language-Action Framework for Aerial Navigation in Complex Environments

Yuze Wu, Mo Zhu, Xingxing Li et al.

This paper proposes VLA-AN, an efficient and onboard Vision-Language-Action (VLA) framework dedicated to autonomous drone navigation in complex environments. VLA-AN addresses four major limitations of existing large aerial navigation models: the data domain gap, insufficient temporal navigation with reasoning, safety issues with generative action policies, and onboard deployment constraints. First, we construct a high-fidelity dataset utilizing 3D Gaussian Splatting (3D-GS) to effectively bridge the domain gap. Second, we introduce a progressive three-stage training framework that sequentially reinforces scene comprehension, core flight skills, and complex navigation capabilities. Third, we design a lightweight, real-time action module coupled with geometric safety correction. This module ensures fast, collision-free, and stable command generation, mitigating the safety risks inherent in stochastic generative policies. Finally, through deep optimization of the onboard deployment pipeline, VLA-AN achieves a robust real-time 8.3x improvement in inference throughput on resource-constrained UAVs. Extensive experiments demonstrate that VLA-AN significantly improves spatial grounding, scene reasoning, and long-horizon navigation, achieving a maximum single-task success rate of 98.1%, and providing an efficient, practical solution for realizing full-chain closed-loop autonomy in lightweight aerial robots.

RO · Mar 2, 2025
FLOAT Drone: A Fully-actuated Coaxial Aerial Robot for Close-Proximity Operations

Junxiao Lin, Shuhang Ji, Yuze Wu et al.

How to endow aerial robots with the ability to operate in close proximity remains an open problem. The core challenges lie in the propulsion system's dual-task requirement: generating manipulation forces while simultaneously counteracting gravity. These competing demands create dynamic coupling effects during physical interactions. Furthermore, rotor-induced airflow disturbances critically undermine operational reliability. Although fully-actuated unmanned aerial vehicles (UAVs) alleviate dynamic coupling effects via six-degree-of-freedom (6-DoF) force-torque decoupling, existing implementations fail to address the aerodynamic interference between drones and environments. They also suffer from oversized designs, which compromise maneuverability and limit their applications in various operational scenarios. To address these limitations, we present FLOAT Drone (FuLly-actuated cOaxial Aerial roboT), a novel fully-actuated UAV featuring two key structural innovations. By integrating control surfaces into fully-actuated systems for the first time, we significantly suppress lateral airflow disturbances during operations. Furthermore, a coaxial dual-rotor configuration enables a compact size while maintaining high hovering efficiency. Through dynamic modeling, we have developed hierarchical position and attitude controllers that support both fully-actuated and underactuated modes. Experimental validation through comprehensive real-world experiments confirms the system's functional capabilities in close-proximity operations.

LG · Feb 1, 2022
Learning Physics-Consistent Particle Interactions

Zhichao Han, David S. Kammer, Olga Fink

Interacting particle systems play a key role in science and engineering. Access to the governing particle interaction law is fundamental for a complete understanding of such systems. However, the inherent system complexity keeps the particle interaction hidden in many cases. Machine learning methods have the potential to learn the behavior of interacting particle systems by combining experiments with data analysis methods. However, most existing algorithms focus on learning the kinetics at the particle level. Learning pairwise interaction, e.g., pairwise force or pairwise potential energy, remains an open challenge. Here, we propose an algorithm that adapts the Graph Networks framework, which contains an edge part to learn the pairwise interaction and a node part to model the dynamics at the particle level. Unlike existing approaches that use neural networks in both parts, we design a deterministic operator in the node part that allows precise inference of pairwise interactions consistent with the underlying physical laws, while being trained only to predict the particle acceleration. We test the proposed methodology on multiple datasets and demonstrate that it achieves superior performance in correctly inferring the pairwise interactions while also being consistent with the underlying physics on all the datasets. The proposed framework is scalable to larger systems and transferable to any type of particle interactions, contrary to the previously proposed purely data-driven solutions. The developed methodology can support a better understanding and discovery of the underlying particle interaction laws, and hence guide the design of materials with targeted properties.

LG · Oct 9, 2021
Graph Neural Networks in Real-Time Fraud Detection with Lambda Architecture

Mingxuan Lu, Zhichao Han, Zitao Zhang et al.

Transaction checkout fraud detection is an essential risk control component for e-commerce marketplaces. In order to leverage graph networks to decrease the fraud rate efficiently and to guarantee that the information passed through neighbors comes only from the past of the checkouts, we present a novel Directed Dynamic Snapshot (DDS) linkage design for graph construction and a Lambda Neural Networks (LNN) architecture for effective inference with Graph Neural Network embeddings. Experiments show that our LNN on the DDS graph outperforms baseline models significantly and is computationally efficient for real-time fraud detection.

LG · Dec 20, 2020
Suspicious Massive Registration Detection via Dynamic Heterogeneous Graph Neural Networks

Susie Xi Rao, Shuai Zhang, Zhichao Han et al.

Massive account registration has raised concerns on risk management in e-commerce companies, especially when registration increases rapidly within a short time frame. To monitor these registrations constantly and minimize the potential loss they might incur, detecting massive registration and predicting their riskiness are necessary. In this paper, we propose a Dynamic Heterogeneous Graph Neural Network framework to capture suspicious massive registrations (DHGReg). We first construct a dynamic heterogeneous graph from the registration data, which is composed of a structural subgraph and a temporal subgraph. Then, we design an efficient architecture to predict suspicious/benign accounts. Our proposed model outperforms the baseline models and is computationally efficient in processing a dynamic heterogeneous graph constructed from a real-world dataset. In practice, the DHGReg framework would benefit the detection of suspicious registration behaviors at an early stage.

LG · Nov 24, 2020
xFraud: Explainable Fraud Transaction Detection

Susie Xi Rao, Shuai Zhang, Zhichao Han et al.

At online retail platforms, it is crucial to actively detect the risks of transactions to improve customer experience and minimize financial loss. In this work, we propose xFraud, an explainable fraud transaction prediction framework which is mainly composed of a detector and an explainer. The xFraud detector can effectively and efficiently predict the legitimacy of incoming transactions. Specifically, it utilizes a heterogeneous graph neural network to learn expressive representations from the informative heterogeneously typed entities in the transaction logs. The explainer in xFraud can generate meaningful and human-understandable explanations from graphs to facilitate further processes in the business unit. In our experiments with xFraud on real transaction networks with up to 1.1 billion nodes and 3.7 billion edges, xFraud is able to outperform various baseline models in many evaluation metrics while remaining scalable in distributed settings. In addition, we show that xFraud explainer can generate reasonable explanations to significantly assist the business analysis via both quantitative and qualitative evaluations.

RO · Nov 8, 2020
Fast-Tracker: A Robust Aerial System for Tracking Agile Target in Cluttered Environments

Zhichao Han, Ruibin Zhang, Neng Pan et al.

This paper proposes a systematic solution that uses an unmanned aerial vehicle (UAV) to aggressively and safely track an agile target. The solution properly handles the challenging situations where the intent of the target and the dense environments are unknown to the UAV. Our work is divided into two parts: target motion prediction and tracking trajectory planning. The target motion prediction method utilizes target observations to reliably predict the future motion of the target considering dynamic constraints. The tracking trajectory planner follows the traditional hierarchical workflow. A target-informed kinodynamic searching method is adopted as the front-end, which heuristically searches for a safe tracking trajectory. The back-end optimizer then refines it into a spatial-temporal optimal and collision-free trajectory. The proposed solution is integrated into an onboard quadrotor system. We fully test the system in challenging real-world tracking missions. Moreover, benchmark comparisons validate that the proposed method surpasses the cutting-edge methods on time efficiency and tracking effectiveness.

SI · Jan 22, 2020
Adversarial Attack on Community Detection by Hiding Individuals

Jia Li, Honglei Zhang, Zhichao Han et al.

It has been demonstrated that adversarial graphs, i.e., graphs with imperceptible perturbations added, can cause deep graph models to fail on node/graph classification tasks. In this paper, we extend adversarial graphs to the problem of community detection which is much more difficult. We focus on black-box attack and aim to hide targeted individuals from the detection of deep graph community detection models, which has many applications in real-world scenarios, for example, protecting personal privacy in social networks and understanding camouflage patterns in transaction networks. We propose an iterative learning framework that takes turns to update two modules: one working as the constrained graph generator and the other as the surrogate community detection model. We also find that the adversarial graphs generated by our method can be transferred to other learning based community detection models.

LG · Oct 10, 2019
DeGNN: Characterizing and Improving Graph Neural Networks with Graph Decomposition

Xupeng Miao, Nezihe Merve Gürel, Wentao Zhang et al.

Despite the wide application of Graph Convolutional Networks (GCNs), one major limitation is that they do not benefit from increasing depth and suffer from the oversmoothing problem. In this work, we first characterize this phenomenon from the information-theoretic perspective and show that under certain conditions, the mutual information between the output after $l$ layers and the input of GCN converges to 0 exponentially with respect to $l$. We also show that, on the other hand, graph decomposition can potentially weaken the condition of such convergence rate, which enabled our analysis for GraphCNN. While different graph structures can only benefit from the corresponding decomposition, in practice, we propose an automatic connectivity-aware graph decomposition algorithm, DeGNN, to improve the performance of general graph neural networks. Extensive experiments on widely adopted benchmark datasets demonstrate that DeGNN not only significantly boosts the performance of the corresponding GNNs, but also achieves state-of-the-art performance.

LG · May 10, 2019
Predicting Path Failure In Time-Evolving Graphs

Jia Li, Zhichao Han, Hong Cheng et al.

In this paper we use a time-evolving graph, which consists of a sequence of graph snapshots over time, to model many real-world networks. We study the path classification problem in a time-evolving graph, which has many applications in real-world scenarios, for example, predicting path failure in a telecommunication network and predicting path congestion in a traffic network in the near future. In order to capture the temporal dependency and graph structure dynamics, we design a novel deep neural network named Long Short-Term Memory R-GCN (LRGCN). LRGCN considers temporal dependency between time-adjacent graph snapshots as a special relation with memory, and uses relational GCN to jointly process both intra-time and inter-time relations. We also propose a new path representation method named self-attentive path embedding (SAPE), to embed paths of arbitrary length into fixed-length vectors. Through experiments on a real-world telecommunication network and a traffic network in California, we demonstrate the superiority of LRGCN over other competing methods in path failure prediction, and show the effectiveness of SAPE on path representation.

LG · Jan 22, 2018
Deeper Insights into Graph Convolutional Networks for Semi-Supervised Learning

Qimai Li, Zhichao Han, Xiao-Ming Wu

Many interesting problems in machine learning are being revisited with new deep learning tools. For graph-based semi-supervised learning, a recent important development is graph convolutional networks (GCNs), which nicely integrate local vertex features and graph topology in the convolutional layers. Although the GCN model compares favorably with other state-of-the-art methods, its mechanisms are not clear and it still requires a considerable amount of labeled data for validation and model selection. In this paper, we develop deeper insights into the GCN model and address its fundamental limits. First, we show that the graph convolution of the GCN model is actually a special form of Laplacian smoothing, which is the key reason why GCNs work, but it also brings potential concerns of over-smoothing with many convolutional layers. Second, to overcome the limits of the GCN model with shallow architectures, we propose both co-training and self-training approaches to train GCNs. Our approaches significantly improve GCNs in learning with very few labels, and exempt them from requiring additional labels for validation. Extensive experiments on benchmarks have verified our theory and proposals.
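The claim that graph convolution is a special form of Laplacian smoothing can be checked numerically on a toy graph. The sketch below assumes the commonly used propagation rule with self-loops and symmetric normalization, drops the learned weights and nonlinearity, and uses a single scalar feature per node; the function names and the three-node path graph are illustrative assumptions, not code from the paper.

```python
import math


def _with_self_loops(adj):
    """Return A~ = A + I and its degree vector."""
    n = len(adj)
    a_tilde = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_tilde]
    return a_tilde, deg


def gcn_propagate(adj, x):
    """GCN propagation on scalar node features: D~^{-1/2} A~ D~^{-1/2} x."""
    a_tilde, deg = _with_self_loops(adj)
    n = len(x)
    return [
        sum(a_tilde[i][j] * x[j] / math.sqrt(deg[i] * deg[j]) for j in range(n))
        for i in range(n)
    ]


def laplacian_smooth(adj, x, gamma=1.0):
    """Laplacian smoothing: x - gamma * D~^{-1/2} (D~ - A~) D~^{-1/2} x."""
    a_tilde, deg = _with_self_loops(adj)
    n = len(x)
    out = []
    for i in range(n):
        lap_x = sum(
            ((deg[i] if i == j else 0.0) - a_tilde[i][j]) * x[j] / math.sqrt(deg[i] * deg[j])
            for j in range(n)
        )
        out.append(x[i] - gamma * lap_x)
    return out


# Path graph 0 - 1 - 2 with one scalar feature per node.
adj = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
x = [1.0, 0.0, 0.0]

one_step = gcn_propagate(adj, x)
smoothed = laplacian_smooth(adj, x, gamma=1.0)  # matches one_step up to rounding

# Stacking many propagation steps: after rescaling by 1/sqrt(degree),
# the node features become numerically indistinguishable (over-smoothing).
deep = x
for _ in range(50):
    deep = gcn_propagate(adj, deep)
_, deg = _with_self_loops(adj)
scaled = [v / math.sqrt(d) for v, d in zip(deep, deg)]
```

With gamma = 1 the two functions agree, and repeated propagation collapses the features onto a single direction determined by node degrees, which is the over-smoothing concern raised in the abstract.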