Xiaoge Zhang

LG
h-index: 20
Papers: 23
Citations: 774
Novelty: 43%
AI Score: 51

23 Papers

LG · Aug 27, 2022 · Code
A Comprehensive Review of Digital Twin -- Part 2: Roles of Uncertainty Quantification and Optimization, a Battery Digital Twin, and Perspectives

Adam Thelen, Xiaoge Zhang, Olga Fink et al.

As an emerging technology in the era of Industry 4.0, digital twin is gaining unprecedented attention because of its promise to further optimize process design, quality control, health monitoring, decision and policy making, and more, by comprehensively modeling the physical world as a group of interconnected digital models. In a two-part series of papers, we examine the fundamental role of different modeling techniques, twinning enabling technologies, and uncertainty quantification and optimization methods commonly used in digital twins. This second paper presents a literature review of key enabling technologies of digital twins, with an emphasis on uncertainty quantification, optimization methods, open source datasets and tools, major findings, challenges, and future directions. Discussions focus on current methods of uncertainty quantification and optimization and how they are applied in different dimensions of a digital twin. Additionally, this paper presents a case study where a battery digital twin is constructed and tested to illustrate some of the modeling and twinning methods reviewed in this two-part review. Code and preprocessed data for generating all the results and figures presented in the case study are available on GitHub.

LG · Jul 31, 2023 · Code
BearingPGA-Net: A Lightweight and Deployable Bearing Fault Diagnosis Network via Decoupled Knowledge Distillation and FPGA Acceleration

Jing-Xiao Liao, Sheng-Lai Wei, Chen-Long Xie et al.

Deep learning has achieved remarkable success in the field of bearing fault diagnosis. However, this success comes with larger models and more complex computations, which cannot be transferred into industrial fields requiring models to be of high speed, strong portability, and low power consumption. In this paper, we propose a lightweight and deployable model for bearing fault diagnosis, referred to as BearingPGA-Net, to address these challenges. Firstly, aided by a well-trained large model, we train BearingPGA-Net via decoupled knowledge distillation. Despite its small size, our model demonstrates excellent fault diagnosis performance compared to other lightweight state-of-the-art methods. Secondly, we design an FPGA acceleration scheme for BearingPGA-Net using Verilog. This scheme involves customized quantization and the design of programmable logic gates for each layer of BearingPGA-Net on the FPGA, with an emphasis on parallel computing and module reuse to enhance the computational speed. To the best of our knowledge, this is the first instance of deploying a CNN-based bearing fault diagnosis model on an FPGA. Experimental results reveal that our deployment scheme achieves over 200 times faster diagnosis speed compared to CPU, while incurring a performance drop below 0.4% in terms of F1, Recall, and Precision score on our independently-collected bearing dataset. Our code is available at https://github.com/asdvfghg/BearingPGA-Net.

CE · Aug 26, 2022
A Comprehensive Review of Digital Twin -- Part 1: Modeling and Twinning Enabling Technologies

Adam Thelen, Xiaoge Zhang, Olga Fink et al.

As an emerging technology in the era of Industry 4.0, digital twin is gaining unprecedented attention because of its promise to further optimize process design, quality control, health monitoring, decision and policy making, and more, by comprehensively modeling the physical world as a group of interconnected digital models. In a two-part series of papers, we examine the fundamental role of different modeling techniques, twinning enabling technologies, and uncertainty quantification and optimization methods commonly used in digital twins. This first paper presents a thorough literature review of digital twin trends across many disciplines currently pursuing this area of research. Then, digital twin modeling and twinning enabling technologies are further analyzed by classifying them into two main categories: physical-to-virtual, and virtual-to-physical, based on the direction in which data flows. Finally, this paper provides perspectives on the trajectory of digital twin technology over the next decade, and introduces a few emerging areas of research which will likely be of great use in future digital twin research. In part two of this review, the roles of uncertainty quantification and optimization are discussed, a battery digital twin is demonstrated, and more perspectives on the future of digital twin are shared.

IR · Nov 8, 2022
Towards Adversarially Robust Recommendation from Adaptive Fraudster Detection

Yuni Lai, Yulin Zhu, Wenqi Fan et al.

The robustness of recommender systems under node injection attacks has garnered significant attention. Recently, GraphRfi, a GNN-based recommender system, was proposed and shown to effectively mitigate the impact of injected fake users. However, we demonstrate that GraphRfi remains vulnerable to attacks due to the supervised nature of its fraudster detection component, where obtaining clean labels is challenging in practice. In particular, we propose a powerful poisoning attack, MetaC, against both GNN-based and MF-based recommender systems. Furthermore, we analyze why GraphRfi fails under such an attack. Then, based on our insights obtained from vulnerability analysis, we design an adaptive fraudster detection module that explicitly considers label uncertainty. This module can serve as a plug-in for different recommender systems, resulting in a robust framework named PDR. Comprehensive experiments show that our defense approach outperforms other benchmark methods under attacks. Overall, our research presents an effective framework for integrating fraudster detection into recommendation systems to achieve adversarial robustness.

NE · Apr 2, 2022
Quadratic Neuron-empowered Heterogeneous Autoencoder for Unsupervised Anomaly Detection

Jing-Xiao Liao, Bo-Jian Hou, Hang-Cheng Dong et al.

Inspired by the complexity and diversity of biological neurons, a quadratic neuron is proposed to replace the inner product in the current neuron with a simplified quadratic function. Employing this novel type of neuron offers a new perspective on developing deep learning. When analyzing quadratic neurons, we find that there exists a function such that a heterogeneous network can approximate it well with a polynomial number of neurons but a purely conventional or quadratic network needs an exponential number of neurons to achieve the same level of error. Encouraged by this theoretical result on heterogeneous networks, we directly integrate conventional and quadratic neurons in an autoencoder to make a new type of heterogeneous autoencoder. To the best of our knowledge, it is the first heterogeneous autoencoder that is made of different types of neurons. Next, we apply the proposed heterogeneous autoencoder to unsupervised anomaly detection for tabular data and bearing fault signals. Anomaly detection faces difficulties such as data unknownness, anomaly feature heterogeneity, and feature unnoticeability, which make it a suitable application for the proposed heterogeneous autoencoder. Its high feature representation ability can characterize a variety of anomaly data (heterogeneity), discriminate the anomaly from the normal (unnoticeability), and accurately learn the distribution of normal samples (unknownness). Experiments show that heterogeneous autoencoders perform competitively compared to other state-of-the-art models.
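
As an illustration of what replacing the inner product with a quadratic function can look like, the sketch below uses one parameterization found in the quadratic-neuron literature: the product of two affine terms plus a weighted sum of squared inputs. The abstract does not state the exact form, so the function name, parameter names, and the ReLU activation here are assumptions rather than the authors' definition.

```python
from typing import Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def quadratic_neuron(
    x: Sequence[float],
    w_r: Sequence[float], b_r: float,
    w_g: Sequence[float], b_g: float,
    w_b: Sequence[float], c: float,
) -> float:
    """Assumed form: relu((w_r.x + b_r) * (w_g.x + b_g) + w_b.(x*x) + c)."""
    affine_r = dot(w_r, x) + b_r                   # first affine term
    affine_g = dot(w_g, x) + b_g                   # second affine term
    power = dot(w_b, [xi * xi for xi in x])        # weighted squared inputs
    pre_activation = affine_r * affine_g + power + c
    return max(0.0, pre_activation)                # ReLU, chosen for illustration


# With these weights the neuron computes x1 * x2, an interaction that a
# single conventional (inner-product) neuron cannot represent.
product = quadratic_neuron([1.0, 2.0], [1.0, 0.0], 0.0, [0.0, 1.0], 0.0, [0.0, 0.0], 0.0)
```

A heterogeneous layer in this spirit would simply mix units of this kind with conventional units computing `relu(dot(w, x) + b)`.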

SE · Nov 1, 2025
HIP-LLM: A Hierarchical Imprecise Probability Approach to Reliability Assessment of Large Language Models

Robab Aghazadeh-Chakherlou, Qing Guo, Siddartha Khastgir et al.

Large Language Models (LLMs) are increasingly deployed across diverse domains, raising the need for rigorous reliability assessment methods. Existing benchmark-based evaluations primarily offer descriptive statistics of model accuracy over datasets, providing limited insight into the probabilistic behavior of LLMs under real operational conditions. This paper introduces HIP-LLM, a Hierarchical Imprecise Probability framework for modeling and inferring LLM reliability. Building upon the foundations of software reliability engineering, HIP-LLM defines LLM reliability as the probability of failure-free operation over a specified number of future tasks under a given Operational Profile (OP). HIP-LLM represents dependencies across (sub-)domains hierarchically, enabling multi-level inference from subdomain to system-level reliability. HIP-LLM embeds imprecise priors to capture epistemic uncertainty and incorporates OPs to reflect usage contexts. It derives posterior reliability envelopes that quantify uncertainty across priors and data. Experiments on multiple benchmark datasets demonstrate that HIP-LLM offers a more accurate and standardized reliability characterization than existing benchmark and state-of-the-art approaches. A publicly accessible repository of HIP-LLM is provided.

LG · Nov 29, 2023
Enhancing the Performance of Neural Networks Through Causal Discovery and Integration of Domain Knowledge

Xiaoge Zhang, Xiao-Lin Wang, Fenglei Fan et al.

In this paper, we develop a generic methodology to encode hierarchical causality structure among observed variables into a neural network in order to improve its predictive performance. The proposed methodology, called causality-informed neural network (CINN), leverages three coherent steps to systematically map the structural causal knowledge into the layer-to-layer design of neural network while strictly preserving the orientation of every causal relationship. In the first step, CINN discovers causal relationships from observational data via directed acyclic graph (DAG) learning, where causal discovery is recast as a continuous optimization problem to avoid the combinatorial nature. In the second step, the discovered hierarchical causality structure among observed variables is systematically encoded into neural network through a dedicated architecture and customized loss function. By categorizing variables in the causal DAG as root, intermediate, and leaf nodes, the hierarchical causal DAG is translated into CINN with a one-to-one correspondence between nodes in the causal DAG and units in the CINN while maintaining the relative order among these nodes. Regarding the loss function, both intermediate and leaf nodes in the DAG graph are treated as target outputs during CINN training so as to drive co-learning of causal relationships among different types of nodes. As multiple loss components emerge in CINN, we leverage the projection of conflicting gradients to mitigate gradient interference among the multiple learning tasks. Computational experiments across a broad spectrum of UCI data sets demonstrate substantial advantages of CINN in predictive performance over other state-of-the-art methods. In addition, an ablation study underscores the value of integrating structural and quantitative causal knowledge in enhancing the neural network's predictive performance incrementally.

86.8 · LG · Apr 15
Dataset-Level Metrics Attenuate Non-Determinism: A Fine-Grained Non-Determinism Evaluation in Diffusion Language Models

Zhengyu Fang, Zhimeng Jiang, Huiyuan Chen et al.

Diffusion language models (DLMs) have emerged as a promising paradigm for large language models (LLMs), yet the non-deterministic behavior of DLMs remains poorly understood. The existing non-determinism evaluations for LLMs predominantly rely on dataset-level metrics under fixed inference configurations, providing limited insight into how model behavior varies across runs and evaluation conditions. In this work, we show that dataset-level metrics systematically attenuate non-determinism in diffusion language models by aggregating sample-level prediction quality across different runs. As a result, configurations with similar aggregate performance can exhibit substantially different behaviors on individual inputs, leaving fine-grained instability and distinct error patterns uncharacterized. To address this limitation, we conduct a fine-grained evaluation of non-determinism based on sample-level prediction differences across a range of model-related factors (including guidance scale, diffusion steps, and Monte Carlo sampling) as well as system-related factors such as batch size, hardware, and numerical precision. Our analysis reveals that non-determinism in DLMs is pervasive and structured, with code generation exhibiting markedly higher sensitivity to factor-level choices than question answering. To attribute the sources of non-determinism, we introduce Factor Variance Attribution (FVA), a cross-factor analysis metric that decomposes observed non-determinism into variance attributable to different evaluation factor settings. Our findings highlight the need for fine-grained, factor-aware evaluation to enable reliable non-determinism assessment of diffusion language models.

CR · Oct 16, 2025 · Code
Stealthy Dual-Trigger Backdoors: Attacking Prompt Tuning in LM-Empowered Graph Foundation Models

Xiaoyu Xue, Yuni Lai, Chenxi Huang et al.

The emergence of graph foundation models (GFMs), particularly those incorporating language models (LMs), has revolutionized graph learning and demonstrated remarkable performance on text-attributed graphs (TAGs). However, compared to traditional GNNs, these LM-empowered GFMs introduce unique security vulnerabilities during the unsecured prompt tuning phase that remain understudied in current research. Through empirical investigation, we reveal a significant performance degradation in traditional graph backdoor attacks when operating in attribute-inaccessible constrained TAG systems without explicit trigger node attribute optimization. To address this, we propose a novel dual-trigger backdoor attack framework that operates at both text-level and struct-level, enabling effective attacks without explicit optimization of trigger node text attributes through the strategic utilization of a pre-established text pool. Extensive experimental evaluations demonstrate that our attack maintains superior clean accuracy while achieving outstanding attack success rates, including scenarios with highly concealed single-trigger nodes. Our work highlights critical backdoor risks in web-deployed LM-empowered GFMs and contributes to the development of more robust supervision mechanisms for open-source platforms in the era of foundation models.

LG · Jun 1, 2025 · Code
NeuronSeek: On Stability and Expressivity of Task-driven Neurons

Hanyu Pei, Jing-Xiao Liao, Qibin Zhao et al.

Drawing inspiration from our human brain that designs different neurons for different tasks, recent advances in deep learning have explored modifying a network's neurons to develop so-called task-driven neurons. Prototyping task-driven neurons (referred to as NeuronSeek) employs symbolic regression (SR) to discover the optimal neuron formulation and construct a network from these optimized neurons. Along this direction, this work replaces symbolic regression with tensor decomposition (TD) to discover optimal neuronal formulations, offering enhanced stability and faster convergence. Furthermore, we establish theoretical guarantees that modifying the aggregation functions with common activation functions can empower a network with a fixed number of parameters to approximate any continuous function with an arbitrarily small error, providing a rigorous mathematical foundation for the NeuronSeek framework. Extensive empirical evaluations demonstrate that our NeuronSeek-TD framework not only achieves superior stability, but also is competitive relative to the state-of-the-art models across diverse benchmarks. The code is available at https://github.com/HanyuPei22/NeuronSeek.

LG · Aug 23, 2024
Causally-Aware Spatio-Temporal Multi-Graph Convolution Network for Accurate and Reliable Traffic Prediction

Pingping Dong, Xiao-Lin Wang, Indranil Bose et al.

Accurate and reliable prediction has profound implications for a wide range of applications. In this study, we focus on one instance of the spatio-temporal learning problem, traffic prediction, to demonstrate an advanced deep learning model developed for making accurate and reliable forecasts. Despite the significant progress in traffic prediction, few studies have incorporated both explicit and implicit traffic patterns simultaneously to improve prediction performance. Meanwhile, the variable nature of traffic states necessitates quantifying the uncertainty of model predictions in a statistically principled way; however, extant studies offer no provable guarantee on the statistical validity of confidence intervals in reflecting their actual likelihood of containing the ground truth. In this paper, we propose an end-to-end traffic prediction framework that leverages three primary components to generate accurate and reliable traffic predictions: dynamic causal structure learning for discovering implicit traffic patterns from massive traffic data, causally-aware spatio-temporal multi-graph convolution network (CASTMGCN) for learning spatio-temporal dependencies, and conformal prediction for uncertainty quantification. CASTMGCN fuses several graphs that characterize different important aspects of traffic networks and an auxiliary graph that captures the effect of exogenous factors on the road network. On this basis, a conformal prediction approach tailored to spatio-temporal data is further developed for quantifying the uncertainty in node-wise traffic predictions over varying prediction horizons. Experimental results on two real-world traffic datasets demonstrate that the proposed method outperforms several state-of-the-art models in prediction accuracy; moreover, it generates more efficient prediction regions than other methods while strictly satisfying the statistical validity in coverage.
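
To make the coverage guarantee concrete, here is a minimal sketch of plain split conformal prediction for a scalar forecast. It is not the spatio-temporal variant developed in the paper; it assumes exchangeable calibration data, absolute-residual nonconformity scores, and hypothetical function names.

```python
import math
from typing import List, Tuple


def conformal_quantile(residuals: List[float], alpha: float) -> float:
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration residual.

    Under exchangeability, intervals of this half-width cover the truth with
    probability at least 1 - alpha. If the calibration set is too small for
    the requested level, the only valid half-width is infinite.
    """
    n = len(residuals)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf
    return sorted(residuals)[rank - 1]


def prediction_interval(point_forecast: float, half_width: float) -> Tuple[float, float]:
    """Symmetric interval around a point forecast."""
    return (point_forecast - half_width, point_forecast + half_width)


# Calibration residuals |y - y_hat| from held-out data (made-up numbers).
calibration_residuals = [float(i) for i in range(1, 11)]
q = conformal_quantile(calibration_residuals, alpha=0.2)
interval = prediction_interval(50.0, q)
```

In a node-wise, multi-horizon setting one natural extension is to calibrate a separate quantile per node and horizon, though the abstract does not say whether the paper does this.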

LG · Jun 2, 2025
Recent Developments in GNNs for Drug Discovery

Zhengyu Fang, Xiaoge Zhang, Anyin Zhao et al.

In this paper, we review recent developments and the role of Graph Neural Networks (GNNs) in computational drug discovery, including molecule generation, molecular property prediction, and drug-drug interaction prediction. By summarizing the most recent developments in this area, we underscore the capabilities of GNNs to comprehend intricate molecular patterns, while exploring both their current and prospective applications. We initiate our discussion by examining various molecular representations, followed by detailed discussions and categorization of existing GNN models based on their input types and downstream application tasks. We also collect a list of commonly used benchmark datasets for a variety of applications. We conclude the paper with brief discussions and summarize common trends in this important research area.

SP · Apr 11, 2024
Classifier-guided neural blind deconvolution: a physics-informed denoising module for bearing fault diagnosis under heavy noise

Jing-Xiao Liao, Chao He, Jipu Li et al.

Blind deconvolution (BD) has been demonstrated as an efficacious approach for extracting bearing fault-specific features from vibration signals under strong background noise. Despite BD's desirable feature in adaptability and mathematical interpretability, a significant challenge persists: How to effectively integrate BD with fault-diagnosing classifiers? This issue arises because the traditional BD method is solely designed for feature extraction with its own optimizer and objective function. When BD is combined with downstream deep learning classifiers, the different learning objectives will be in conflict. To address this problem, this paper introduces classifier-guided BD (ClassBD) for joint learning of BD-based feature extraction and deep learning-based fault classification. Firstly, we present a time and frequency neural BD that employs neural networks to implement conventional BD, thereby facilitating the seamless integration of BD and the deep learning classifier for co-optimization of model parameters. Subsequently, we develop a unified framework to use a deep learning classifier to guide the learning of BD filters. In addition, we devise a physics-informed loss function composed of kurtosis, $l_2/l_4$ norm, and a cross-entropy loss to jointly optimize the BD filters and deep learning classifier. Consequently, the fault labels provide useful information to direct BD to extract features that distinguish classes amidst strong noise. To the best of our knowledge, this is the first of its kind that BD is successfully applied to bearing fault diagnosis. Experimental results from three datasets demonstrate that ClassBD outperforms other state-of-the-art methods under noisy conditions.

LG · May 28, 2025
A Closer Look on Memorization in Tabular Diffusion Model: A Data-Centric Perspective

Zhengyu Fang, Zhimeng Jiang, Huiyuan Chen et al.

Diffusion models have shown strong performance in generating high-quality tabular data, but they carry privacy risks by reproducing exact training samples. While prior work focuses on dataset-level augmentation to reduce memorization, little is known about which individual samples contribute most. We present the first data-centric study of memorization dynamics in tabular diffusion models. We quantify memorization for each real sample based on how many generated samples are flagged as replicas, using a relative distance ratio. Our empirical analysis reveals a heavy-tailed distribution of memorization counts: a small subset of samples contributes disproportionately to leakage, confirmed via sample-removal experiments. To understand this, we divide real samples into top- and non-top-memorized groups and analyze their training-time behaviors. We track when each sample is first memorized and monitor per-epoch memorization intensity (AUC). Memorized samples are memorized slightly earlier and show stronger signals in early training. Based on these insights, we propose DynamicCut, a two-stage, model-agnostic mitigation method: (a) rank samples by epoch-wise intensity, (b) prune a tunable top fraction, and (c) retrain on the filtered dataset. Across multiple tabular datasets and models, DynamicCut reduces memorization with minimal impact on data diversity and downstream performance. It also complements augmentation-based defenses. Furthermore, DynamicCut enables cross-model transferability: high-ranked samples identified from one model (e.g., a diffusion model) are also effective for reducing memorization when removed from others, such as GANs and VAEs.
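
The per-sample memorization count described above can be sketched as follows. The abstract does not define the relative distance ratio, so this example assumes it is the distance to the nearest real sample divided by the distance to the second-nearest, with Euclidean distance and a threshold of 1/3; the function name and all of these choices are assumptions for illustration.

```python
import math
from collections import Counter
from typing import Sequence


def memorization_counts(
    real: Sequence[Sequence[float]],
    generated: Sequence[Sequence[float]],
    threshold: float = 1 / 3,  # assumed cutoff, not taken from the abstract
) -> Counter:
    """Count, for each real sample index, how many generated samples replicate it.

    A generated sample is flagged as a replica of its nearest real sample when
    nearest_distance / second_nearest_distance < threshold.
    Requires at least two real samples.
    """
    counts: Counter = Counter()
    for g in generated:
        # Sort real samples by Euclidean distance to this generated sample.
        ranked = sorted((math.dist(g, r), idx) for idx, r in enumerate(real))
        (d1, nearest), (d2, _) = ranked[0], ranked[1]
        if d2 > 0 and d1 / d2 < threshold:
            counts[nearest] += 1
    return counts


real_rows = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
generated_rows = [(0.1, 0.0), (0.2, 0.1), (5.0, 5.0)]
counts = memorization_counts(real_rows, generated_rows)
```

Ranking real samples by such counts and dropping the top fraction before retraining is the kind of pruning step the abstract attributes to DynamicCut, although DynamicCut ranks by epoch-wise intensity rather than by a single final count.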

LG · Jan 11, 2025
Dual-Modality Representation Learning for Molecular Property Prediction

Anyin Zhao, Zuquan Chen, Zhengyu Fang et al.

Molecular property prediction has attracted substantial attention recently. Accurate prediction of drug properties relies heavily on effective molecular representations. The structures of chemical compounds are commonly represented as graphs or SMILES sequences. Recent advances in learning drug properties commonly employ Graph Neural Networks (GNNs) based on the graph representation. For the SMILES representation, Transformer-based architectures have been adopted by treating each SMILES string as a sequence of tokens. Because each representation has its own advantages and disadvantages, combining both representations in learning drug properties is a promising direction. We propose a method named Dual-Modality Cross-Attention (DMCA) that can effectively combine the strengths of two representations by employing the cross-attention mechanism. DMCA was evaluated across eight datasets including both classification and regression tasks. Results show that our method achieves the best overall performance, highlighting its effectiveness in leveraging the complementary information from both graph and SMILES modalities.

CV · Nov 25, 2025
FLaTEC: Frequency-Disentangled Latent Triplanes for Efficient Compression of LiDAR Point Clouds

Xiaoge Zhang, Zijie Wu, Mingtao Feng et al.

Point cloud compression methods jointly optimize bitrates and reconstruction distortion. However, balancing compression ratio and reconstruction quality is difficult because low-frequency and high-frequency components contribute differently at the same resolution. To address this, we propose FLaTEC, a frequency-aware compression model that enables the compression of a full scan with high compression ratios. Our approach introduces a frequency-aware mechanism that decouples low-frequency structures and high-frequency textures, while hybridizing latent triplanes as a compact proxy for the point cloud. Specifically, we convert voxelized embeddings into triplane representations to reduce sparsity, computational cost, and storage requirements. We then devise a frequency-disentangling technique that extracts compact low-frequency content while collecting high-frequency details across scales. The decoupled low-frequency and high-frequency components are stored in binary format. During decoding, full-spectrum signals are progressively recovered via a modulation block. Additionally, to compensate for the loss of 3D correlation, we introduce an efficient frequency-based attention mechanism that fosters local connectivity and outputs arbitrary resolution points. Our method achieves state-of-the-art rate-distortion performance and outperforms the standard codecs by 78% and 94% in BD-rate on both the SemanticKITTI and Ford datasets.

IV · Dec 28, 2024
Implementing Trust in Non-Small Cell Lung Cancer Diagnosis with a Conformalized Uncertainty-Aware AI Framework in Whole-Slide Images

Xiaoge Zhang, Tao Wang, Chao Yan et al.

Ensuring trustworthiness is fundamental to the development of artificial intelligence (AI) that is considered societally responsible, particularly in cancer diagnostics, where a misdiagnosis can have dire consequences. Current digital pathology AI models lack systematic solutions to address trustworthiness concerns arising from model limitations and data discrepancies between model deployment and development environments. To address this issue, we developed TRUECAM, a framework designed to ensure both data and model trustworthiness in non-small cell lung cancer subtyping with whole-slide images. TRUECAM integrates 1) a spectral-normalized neural Gaussian process for identifying out-of-scope inputs and 2) an ambiguity-guided elimination of tiles to filter out highly ambiguous regions, addressing data trustworthiness, as well as 3) conformal prediction to ensure controlled error rates. We systematically evaluated the framework across multiple large-scale cancer datasets, leveraging both task-specific and foundation models, and illustrate that an AI model wrapped with TRUECAM significantly outperforms models that lack such guidance in terms of classification accuracy, robustness, interpretability, and data efficiency, while also achieving improvements in fairness. These findings highlight TRUECAM as a versatile wrapper framework for digital pathology AI models with diverse architectural designs, promoting their responsible and effective applications in real-world settings.

CV · Nov 21, 2024
DiffCom: Decoupled Sparse Priors Guided Diffusion Compression for Point Clouds

Xiaoge Zhang, Zijie Wu, Mehwish Nasim et al.

Lossy compression relies on an autoencoder to transform a point cloud into latent points for storage, leaving the inherent redundancy of latent representations unexplored. To reduce redundancy in latent points, we propose a diffusion-based framework guided by sparse priors that achieves high reconstruction quality, especially at low bitrates. Our approach features an efficient dual-density data flow that relaxes size constraints on latent points. It hybridizes a probabilistic conditional diffusion model to encapsulate essential details for reconstruction within sparse priors, which are decoupled hierarchically into intra- and inter-point priors. Specifically, our DiffCom encodes the original point cloud into latent points and decoupled sparse priors through separate encoders. To dynamically attend to geometric and semantic cues from the priors at each encoding and decoding layer, we employ an attention-guided latent denoiser conditioned on the decoupled priors. Additionally, we integrate the local distribution into the arithmetic encoder and decoder to enhance local context modeling of the sparse points. The original point cloud is reconstructed through a point decoder. Compared to state-of-the-art methods, our approach achieves a superior rate-distortion trade-off, as evidenced by extensive evaluations on the ShapeNet dataset and standard test datasets from the MPEG PCC Group.

LG · May 7, 2023
Uncertainty Quantification in Machine Learning for Engineering Design and Health Prognostics: A Tutorial

Venkat Nemani, Luca Biggio, Xun Huan et al.

On top of machine learning models, uncertainty quantification (UQ) functions as an essential layer of safety assurance that could lead to more principled decision making by enabling sound risk assessment and management. The safety and reliability improvement of ML models empowered by UQ has the potential to significantly facilitate the broad adoption of ML solutions in high-stakes decision settings, such as healthcare, manufacturing, and aviation, to name a few. In this tutorial, we aim to provide a holistic lens on emerging UQ methods for ML models with a particular focus on neural networks and the applications of these UQ methods in tackling engineering design as well as prognostics and health management problems. Toward this goal, we start with a comprehensive classification of uncertainty types, sources, and causes pertaining to UQ of ML models. Next, we provide a tutorial-style description of several state-of-the-art UQ methods: Gaussian process regression, Bayesian neural network, neural network ensemble, and deterministic UQ methods focusing on spectral-normalized neural Gaussian process. Established upon the mathematical formulations, we subsequently examine the soundness of these UQ methods quantitatively and qualitatively (by a toy regression example) to examine their strengths and shortcomings from different dimensions. Then, we review quantitative metrics commonly used to assess the quality of predictive uncertainty in classification and regression problems. Afterward, we discuss the increasingly important role of UQ of ML models in solving challenging problems in engineering design and health prognostics. Two case studies with source codes available on GitHub are used to demonstrate these UQ methods and compare their performance in the life prediction of lithium-ion batteries at the early stage and the remaining useful life prediction of turbofan engines.

LG · Dec 1, 2021
A generic physics-informed neural network-based framework for reliability assessment of multi-state systems

Taotao Zhou, Xiaoge Zhang, Enrique Lopez Droguett et al.

In this paper, we leverage the recent advances in physics-informed neural network (PINN) and develop a generic PINN-based framework to assess the reliability of multi-state systems (MSSs). The proposed methodology consists of two major steps. In the first step, we recast the reliability assessment of MSS as a machine learning problem using the framework of PINN. A feedforward neural network with two individual loss groups is constructed to encode the initial condition and state transitions governed by ordinary differential equations (ODEs) in MSS. Next, we tackle the problem of high imbalance in the magnitude of the back-propagated gradients in PINN from a multi-task learning perspective. Particularly, we treat each element in the loss function as an individual task, and adopt a gradient surgery approach named projecting conflicting gradients (PCGrad), where a task's gradient is projected onto the normal plane of any other task that has a conflicting gradient. The gradient projection operation significantly mitigates the detrimental effects caused by the gradient interference when training PINN, thus accelerating the convergence of PINN to high-precision solutions for MSS reliability assessment. With the proposed PINN-based framework, we investigate its applications for MSS reliability assessment in several different contexts in terms of time-independent or time-dependent state transitions and system scales varying from small to medium. The results demonstrate that the proposed PINN-based framework shows generic and remarkable performance in MSS reliability assessment, and the incorporation of PCGrad in PINN leads to substantial improvement in solution quality and convergence speed.
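
The projection step of PCGrad can be sketched on plain Python lists as below. This is a minimal illustration of the published gradient-surgery rule (when two task gradients have a negative inner product, remove from one its component along the other), not the paper's training code; the function names and the fixed random seed are assumptions.

```python
import random
from typing import List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def pcgrad(task_grads: Sequence[Sequence[float]], seed: int = 0) -> List[float]:
    """Combine per-task gradients after projecting away pairwise conflicts.

    For each task i, visit the other tasks in random order; whenever the
    current g_i conflicts with g_j (negative inner product), replace g_i by
    g_i - (g_i.g_j / |g_j|^2) * g_j, its projection onto the plane normal
    to g_j. The projected task gradients are then summed.
    """
    rng = random.Random(seed)
    projected = []
    for i, grad in enumerate(task_grads):
        g_i = list(grad)
        others = [j for j in range(len(task_grads)) if j != i]
        rng.shuffle(others)
        for j in others:
            g_j = task_grads[j]           # always the original, unprojected g_j
            inner = dot(g_i, g_j)
            if inner < 0:                 # conflict, so g_j is necessarily nonzero
                scale = inner / dot(g_j, g_j)
                g_i = [a - scale * b for a, b in zip(g_i, g_j)]
        projected.append(g_i)
    return [sum(column) for column in zip(*projected)]


# Two loss terms (for example, initial-condition and ODE-residual losses)
# whose gradients conflict: their inner product is -1.
update = pcgrad([[1.0, 0.0], [-1.0, 1.0]])
```

For the two gradients above, the first is projected to `[0.5, 0.5]` and the second to `[0.0, 1.0]`, so the combined update no longer contains components in which one task directly opposes the other.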

AI · Jun 9, 2014
A bio-inspired algorithm for fuzzy user equilibrium problem by aid of Physarum Polycephalum

Yang Liu, Xiaoge Zhang, Yong Deng

The user equilibrium in the traffic assignment problem is based on the fact that travelers choose the minimum-cost path between every origin-destination pair and on the assumption that such behavior will lead to an equilibrium of the traffic network. In this paper, we consider this problem when the traffic network links have fuzzy costs. A Physarum-type algorithm is developed to unify the Physarum network and the traffic network, taking full advantage of Physarum polycephalum's adaptivity in network design to solve the user equilibrium problem. Finally, experiments are used to test the performance of this method. The results demonstrate that our approach is competitive when compared with other existing algorithms.

NE · Mar 21, 2014
A Physarum-Inspired Approach to Optimal Supply Chain Network Design at Minimum Total Cost with Demand Satisfaction

Xiaoge Zhang, Andrew Adamatzky, Xin-She Yang et al.

A supply chain is a system which moves products from a supplier to customers. Supply chains are ubiquitous and play a key role in all economic activities. Inspired by the biological principles of nutrient distribution in the protoplasmic networks of the slime mould Physarum polycephalum, we propose a novel algorithm for supply chain design. The algorithm handles supply networks where capacity investments and product flows are variables. The networks are constrained by the need to satisfy product demands. Two features of the slime mould are adopted in our algorithm. The first is the continuity of flux during the iterative process, which is used in the real-time update of the costs associated with the supply links. The second feature is adaptivity: the supply chain can converge to an equilibrium state when costs are changed. The practicality and flexibility of our algorithm are illustrated with numerical examples.

NE · Nov 3, 2013
An Adaptive Amoeba Algorithm for Shortest Path Tree Computation in Dynamic Graphs

Xiaoge Zhang, Qi Liu, Yong Hu et al.

This paper presents an adaptive amoeba algorithm to address the shortest path tree (SPT) problem in dynamic graphs. In dynamic graphs, edge weight updates fall into three categories: edge weight increases, edge weight decreases, and a mixture of the two. Existing work on this problem analyzes the nodes influenced by the edge weight updates and recomputes these affected vertices. However, as the network grows, this process becomes complex. The proposed method overcomes these disadvantages of the existing approaches. The most important feature of this algorithm is its adaptivity: when the edge weights change, the proposed algorithm can recognize the affected vertices and reconstruct them spontaneously. To evaluate the proposed adaptive amoeba algorithm, we compare it with the Label Setting algorithm and the Bellman-Ford algorithm. The comparison results demonstrate the effectiveness of the proposed method.