Ziqi Chen

LG
h-index17
30papers
373citations
Novelty56%
AI Score58

30 Papers

SDJun 1
MOSS-Audio Technical Report

Chen Yang, Chufan Yu, Hanfu Chen et al.

MOSS-Audio is a unified audio-language model for speech, environmental sound, and music understanding, supporting audio captioning, time-aware question answering, timestamped transcription, and audio-grounded reasoning. MOSS-Audio couples a dedicated audio encoder with a modality adapter and a large language model: the encoder produces 12.5 Hz temporal representations, the adapter projects them into the decoder space, and the decoder generates autoregressive text outputs. Two design choices are central to the system: \textbf{DeepStack cross-layer feature injection}, which exposes the decoder to acoustic information from multiple encoder depths, and \textbf{time markers}, which provide explicit temporal cues by inserting timestamp markers into the audio-token stream. At the data level, we design an event-preserving audio annotation pipeline that segments raw audio at coherent event boundaries, applies branch-specific annotation to speech, music, and general audio, and merges the results into unified captions for pretraining. The intermediate branch-specific captions are further retained to support the construction of task-oriented SFT data. The model is pretrained on large-scale audio-language data, with time-aware objectives incorporated to support temporal grounding, and then undergoes multi-stage post-training to enhance instruction following and audio-grounded reasoning. We release 4B and 8B variants in both Instruct and Thinking configurations. MOSS-Audio achieves strong performance across general audio understanding, speech captioning, ASR, and timestamped ASR, positioning it as a promising understanding foundation for future voice agents.

SRMar 11, 2022Code
A comparative study of non-deep learning, deep learning, and ensemble learning methods for sunspot number prediction

Yuchen Dang, Ziqi Chen, Heng Li et al.

Solar activity has significant impacts on human activities and health. One most commonly used measure of solar activity is the sunspot number. This paper compares three important non-deep learning models, four popular deep learning models, and their five ensemble models in forecasting sunspot numbers. In particular, we propose an ensemble model called XGBoost-DL, which uses XGBoost as a two-level nonlinear ensemble method to combine the deep learning models. Our XGBoost-DL achieves the best forecasting performance (RMSE = 25.70 and MAE = 19.82) in the comparison, outperforming the best non-deep learning model SARIMA (RMSE = 54.11 and MAE = 45.51), the best deep learning model Informer (RMSE = 29.90 and MAE = 22.35) and the NASA's forecast (RMSE = 48.38 and MAE = 38.45). Our XGBoost-DL forecasts a peak sunspot number of 133.47 in May 2025 for Solar Cycle 25 and 164.62 in November 2035 for Solar Cycle 26, similar to but later than the NASA's at 137.7 in October 2024 and 161.2 in December 2034. An open-source Python package of our XGBoost-DL for the sunspot number prediction is available at https://github.com/yd1008/ts_ensemble_sunspot.

LGAug 23, 2023
Shape-conditioned 3D Molecule Generation via Equivariant Diffusion Models

Ziqi Chen, Bo Peng, Srinivasan Parthasarathy et al.

Ligand-based drug design aims to identify novel drug candidates of similar shapes with known active molecules. In this paper, we formulated an in silico shape-conditioned molecule generation problem to generate 3D molecule structures conditioned on the shape of a given molecule. To address this problem, we developed a translation- and rotation-equivariant shape-guided generative model ShapeMol. ShapeMol consists of an equivariant shape encoder that maps molecular surface shapes into latent embeddings, and an equivariant diffusion model that generates 3D molecules based on these embeddings. Experimental results show that ShapeMol can generate novel, diverse, drug-like molecules that retain 3D molecular shapes similar to the given shape condition. These results demonstrate the potential of ShapeMol in designing drug candidates of desired 3D shapes binding to protein target pockets.

LGApr 9, 2023
Nearest-Neighbor Sampling Based Conditional Independence Testing

Shuai Li, Ziqi Chen, Hongtu Zhu et al.

The conditional randomization test (CRT) was recently proposed to test whether two random variables X and Y are conditionally independent given random variables Z. The CRT assumes that the conditional distribution of X given Z is known under the null hypothesis and then it is compared to the distribution of the observed samples of the original data. The aim of this paper is to develop a novel alternative of CRT by using nearest-neighbor sampling without assuming the exact form of the distribution of X given Z. Specifically, we utilize the computationally efficient 1-nearest-neighbor to approximate the conditional distribution that encodes the null hypothesis. Then, theoretically, we show that the distribution of the generated samples is very close to the true conditional distribution in terms of total variation distance. Furthermore, we take the classifier-based conditional mutual information estimator as our test statistic. The test statistic as an empirical fundamental information theoretic quantity is able to well capture the conditional-dependence feature. We show that our proposed test is computationally very fast, while controlling type I and II errors quite well. Finally, we demonstrate the efficiency of our proposed test in both synthetic and real data analyses.

AIFeb 14, 2024Code
LlaSMol: Advancing Large Language Models for Chemistry with a Large-Scale, Comprehensive, High-Quality Instruction Tuning Dataset

Botao Yu, Frazier N. Baker, Ziqi Chen et al.

Chemistry plays a crucial role in many domains, such as drug discovery and material science. While large language models (LLMs) such as GPT-4 exhibit remarkable capabilities on natural language processing tasks, existing research indicates that their performance on chemistry tasks is discouragingly low. In this paper, however, we demonstrate that our developed LLMs can achieve very strong results on a comprehensive set of chemistry tasks, outperforming the most advanced GPT-4 and Claude 3 Opus by a substantial margin. To accomplish this, we propose SMolInstruct, a large-scale, comprehensive, and high-quality dataset for instruction tuning. It contains 14 selected chemistry tasks and over three million samples, laying a solid foundation for training and evaluating LLMs for chemistry. Using SMolInstruct, we fine-tune a set of open-source LLMs, among which, we find that Mistral serves as the best base model for chemistry tasks. Our analysis further demonstrates the critical role of the proposed dataset in driving the performance improvements.

NEJan 15
PACEvolve: Enabling Long-Horizon Progress-Aware Consistent Evolution

Minghao Yan, Bo Peng, Benjamin Coleman et al.

Large Language Models (LLMs) have emerged as powerful operators for evolutionary search, yet the design of efficient search scaffolds remains ad hoc. While promising, current LLM-in-the-loop systems lack a systematic approach to managing the evolutionary process. We identify three distinct failure modes: Context Pollution, where experiment history biases future candidate generation; Mode Collapse, where agents stagnate in local minima due to poor exploration-exploitation balance; and Weak Collaboration, where rigid crossover strategies fail to leverage parallel search trajectories effectively. We introduce Progress-Aware Consistent Evolution (PACEvolve), a framework designed to robustly govern the agent's context and search dynamics, to address these challenges. PACEvolve combines hierarchical context management (HCM) with pruning to address context pollution; momentum-based backtracking (MBB) to escape local minima; and a self-adaptive sampling policy that unifies backtracking and crossover for dynamic search coordination (CE), allowing agents to balance internal refinement with cross-trajectory collaboration. We demonstrate that PACEvolve provides a systematic path to consistent, long-horizon self-improvement, achieving state-of-the-art results on LLM-SR and KernelBench, while discovering solutions surpassing the record on Modded NanoGPT.

SDMar 18
MOSS-TTS Technical Report

Yitian Gong, Botian Jiang, Yiwei Zhao et al.

This technical report presents MOSS-TTS, a speech generation foundation model built on a scalable recipe: discrete audio tokens, autoregressive modeling, and large-scale pretraining. Built on MOSS-Audio-Tokenizer, a causal Transformer tokenizer that compresses 24 kHz audio to 12.5 fps with variable-bitrate RVQ and unified semantic-acoustic representations, we release two complementary generators: MOSS-TTS, which emphasizes structural simplicity, scalability, and long-context/control-oriented deployment, and MOSS-TTS-Local-Transformer, which introduces a frame-local autoregressive module for higher modeling efficiency, stronger speaker preservation, and a shorter time to first audio. Across multilingual and open-domain settings, MOSS-TTS supports zero-shot voice cloning, token-level duration control, phoneme-/pinyin-level pronunciation control, smooth code-switching, and stable long-form generation. This report summarizes the design, training recipe, and empirical characteristics of the released models.

QMMar 2, 2023
T-Cell Receptor Optimization with Reinforcement Learning and Mutation Policies for Precesion Immunotherapy

Ziqi Chen, Martin Renqiang Min, Hongyu Guo et al.

T cells monitor the health status of cells by identifying foreign peptides displayed on their surface. T-cell receptors (TCRs), which are protein complexes found on the surface of T cells, are able to bind to these peptides. This process is known as TCR recognition and constitutes a key step for immune response. Optimizing TCR sequences for TCR recognition represents a fundamental step towards the development of personalized treatments to trigger immune responses killing cancerous or virus-infected cells. In this paper, we formulated the search for these optimized TCRs as a reinforcement learning (RL) problem, and presented a framework TCRPPO with a mutation policy using proximal policy optimization. TCRPPO mutates TCRs into effective ones that can recognize given peptides. TCRPPO leverages a reward function that combines the likelihoods of mutated sequences being valid TCRs measured by a new scoring function based on deep autoencoders, with the probabilities of mutated sequences recognizing peptides from a peptide-TCR interaction predictor. We compared TCRPPO with multiple baseline methods and demonstrated that TCRPPO significantly outperforms all the baseline methods to generate positive binding and valid TCRs. These results demonstrate the potential of TCRPPO for both precision immunotherapy and peptide-recognizing TCR motif discovery.

LGSep 6, 2023
RLSynC: Offline-Online Reinforcement Learning for Synthon Completion

Frazier N. Baker, Ziqi Chen, Daniel Adu-Ampratwum et al.

Retrosynthesis is the process of determining the set of reactant molecules that can react to form a desired product. Semi-template-based retrosynthesis methods, which imitate the reverse logic of synthesis reactions, first predict the reaction centers in the products, and then complete the resulting synthons back into reactants. We develop a new offline-online reinforcement learning method RLSynC for synthon completion in semi-template-based methods. RLSynC assigns one agent to each synthon, all of which complete the synthons by conducting actions step by step in a synchronized fashion. RLSynC learns the policy from both offline training episodes and online interactions, which allows RLSynC to explore new reaction spaces. RLSynC uses a standalone forward synthesis model to evaluate the likelihood of the predicted reactants in synthesizing a product, and thus guides the action search. Our results demonstrate that RLSynC can outperform state-of-the-art synthon completion methods with improvements as high as 14.9%, highlighting its potential in synthesis planning.

NIMar 26
A Wireless World Model for AI-Native 6G Networks

Ziqi Chen, Yi Ren, Yixuan Huang et al.

Integrating AI into the physical layer is a cornerstone of 6G networks. However, current data-driven approaches struggle to generalize across dynamic environments because they lack an intrinsic understanding of electromagnetic wave propagation. We introduce the Wireless World Model (WWM), a multi-modal foundation framework predicting the spatiotemporal evolution of wireless channels by internalizing the causal relationship between 3D geometry and signal dynamics. Pre-trained on a massive ray-traced multi-modal dataset, WWM overcomes the data authenticity gap, further validated under real-world measurement data. Using a joint-embedding predictive architecture with a multi-modal mixture-of-experts Transformer, WWM fuses channel state information, 3D point clouds, and user trajectories into a unified representation. Across the five key downstream tasks supported by WWM, it achieves remarkable performance in seen environments, unseen generalization scenarios, and real-world measurements, consistently outperforming SOTA uni-modal foundation models and task-specific models. This paves the way for physics-aware 6G intelligence that adapts to the physical world.

CVJun 15, 2025Code
Unleashing Diffusion and State Space Models for Medical Image Segmentation

Rong Wu, Ziqi Chen, Liming Zhong et al.

Existing segmentation models trained on a single medical imaging dataset often lack robustness when encountering unseen organs or tumors. Developing a robust model capable of identifying rare or novel tumor categories not present during training is crucial for advancing medical imaging applications. We propose DSM, a novel framework that leverages diffusion and state space models to segment unseen tumor categories beyond the training data. DSM utilizes two sets of object queries trained within modified attention decoders to enhance classification accuracy. Initially, the model learns organ queries using an object-aware feature grouping strategy to capture organ-level visual features. It then refines tumor queries by focusing on diffusion-based visual prompts, enabling precise segmentation of previously unseen tumors. Furthermore, we incorporate diffusion-guided feature fusion to improve semantic segmentation performance. By integrating CLIP text embeddings, DSM captures category-sensitive classes to improve linguistic transfer knowledge, thereby enhancing the model's robustness across diverse scenarios and multi-label tasks. Extensive experiments demonstrate the superior performance of DSM in various tumor segmentation tasks. Code is available at https://github.com/Rows21/k-Means_Mask_Mamba.

LGMay 7
PACEvolve++: Improving Test-time Learning for Evolutionary Search Agents

Minghao Yan, Bo Peng, Benjamin Coleman et al.

Large language models have become drivers of evolutionary search, but most systems rely on a fixed, prompt-elicited policy to sample next candidates. This limits adaptation in practical engineering and research tasks, where evaluations are expensive, and progress depends on learning task-specific search dynamics. We introduce PACEvolve++, an advisor-model reinforcement learning framework for test-time policy adaptation in evolutionary search agents. PACEvolve++ decouples strategic search decisions from implementation: a trainable advisor generates, assesses, and selects hypotheses, while a stronger frontier model translates selected hypotheses into executable candidates. To train the advisor under non-stationary feedback, we propose a phase-adaptive approach that adapts its optimization strategy to different phases of the evolutionary process. Early in evolution, it uses group-relative feedback to learn broad search preferences; later, as reward gaps compress, it emphasizes best-of-$k$ frontier contribution to support stable refinement. Across expert-parallel load balancing, sequential recommendation, and protein fitness extrapolation, PACEvolve++ outperforms the state-of-the-art evolutionary search framework with frontier models, achieving faster convergence and stabilizing test-time training during evolutionary search.

BMOct 20, 2024Code
log-RRIM: Yield Prediction via Local-to-global Reaction Representation Learning and Interaction Modeling

Xiao Hu, Ziqi Chen, Bo Peng et al.

Accurate prediction of chemical reaction yields is crucial for optimizing organic synthesis, potentially reducing time and resources spent on experimentation. With the rise of artificial intelligence (AI), there is growing interest in leveraging AI-based methods to accelerate yield predictions without conducting in vitro experiments. We present log-RRIM, an innovative graph transformer-based framework designed for predicting chemical reaction yields. A key feature of log-RRIM is its integration of a cross-attention mechanism that focuses on the interplay between reagents and reaction centers. This design reflects a fundamental principle in chemical reactions: the crucial role of reagents in influencing bond-breaking and formation processes, which ultimately affect reaction yields. log-RRIM also implements a local-to-global reaction representation learning strategy. This approach initially captures detailed molecule-level information and then models and aggregates intermolecular interactions. Through this hierarchical process, log-RRIM effectively captures how different molecular fragments contribute to and influence the overall reaction yield, regardless of their size variations. log-RRIM shows superior performance in our experiments, especially for medium to high-yielding reactions, proving its reliability as a predictor. The framework's sophisticated modeling of reactant-reagent interactions and precise capture of molecular fragment contributions make it a valuable tool for reaction planning and optimization in chemical synthesis. The data and codes of log-RRIM are accessible through https://github.com/ninglab/Yield_log_RRIM.

LGFeb 9, 2025
Generating 3D Binding Molecules Using Shape-Conditioned Diffusion Models with Guidance

Ziqi Chen, Bo Peng, Tianhua Zhai et al.

Drug development is a critical but notoriously resource- and time-consuming process. In this manuscript, we develop a novel generative artificial intelligence (genAI) method DiffSMol to facilitate drug development. DiffSmol generates 3D binding molecules based on the shapes of known ligands. DiffSMol encapsulates geometric details of ligand shapes within pre-trained, expressive shape embeddings and then generates new binding molecules through a diffusion model. DiffSMol further modifies the generated 3D structures iteratively via shape guidance to better resemble the ligand shapes. It also tailors the generated molecules toward optimal binding affinities under the guidance of protein pockets. Here, we show that DiffSMol outperforms the state-of-the-art methods on benchmark datasets. When generating binding molecules resembling ligand shapes, DiffSMol with shape guidance achieves a success rate 61.4%, substantially outperforming the best baseline (11.2%), meanwhile producing molecules with novel molecular graph structures. DiffSMol with pocket guidance also outperforms the best baseline in binding affinities by 13.2%, and even by 17.7% when combined with shape guidance. Case studies for two critical drug targets demonstrate very favorable physicochemical and pharmacokinetic properties of the generated molecules, thus, the potential of DiffSMol in developing promising drug candidates.

MLDec 16, 2024
Conditional Diffusion Models Based Conditional Independence Testing

Yanfeng Yang, Shuai Li, Yingjie Zhang et al.

Conditional independence (CI) testing is a fundamental task in modern statistics and machine learning. The conditional randomization test (CRT) was recently introduced to test whether two random variables, $X$ and $Y$, are conditionally independent given a potentially high-dimensional set of random variables, $Z$. The CRT operates exceptionally well under the assumption that the conditional distribution $X|Z$ is known. However, since this distribution is typically unknown in practice, accurately approximating it becomes crucial. In this paper, we propose using conditional diffusion models (CDMs) to learn the distribution of $X|Z$. Theoretically and empirically, it is shown that CDMs closely approximate the true conditional distribution. Furthermore, CDMs offer a more accurate approximation of $X|Z$ compared to GANs, potentially leading to a CRT that performs better than those based on GANs. To accommodate complex dependency structures, we utilize a computationally efficient classifier-based conditional mutual information (CMI) estimator as our test statistic. The proposed testing procedure performs effectively without requiring assumptions about specific distribution forms or feature dependencies, and is capable of handling mixed-type conditioning sets that include both continuous and discrete variables. Theoretical analysis shows that our proposed test achieves a valid control of the type I error. A series of experiments on synthetic data demonstrates that our new test effectively controls both type-I and type-II errors, even in high dimensional scenarios.

LGOct 13, 2025
FedLoRA-Optimizer: Federated LoRA Fine-Tuning with Global and Local Optimization in Heterogeneous Data Scenarios

Jianzhe Zhao, Hailin Zhu, Yu Zhang et al.

Federated efficient fine-tuning has emerged as an approach that leverages distributed data and computational resources across nodes to address the challenges of large-scale fine-tuning and privacy preservation. The Low-Rank Adaptation (LoRA) enables efficient fine-tuning of large-scale pre-trained models by introducing trainable low-rank matrices into weight updates.However, in heterogeneous data scenarios, client drift weakens the generalization of the global model, and local models often fail to meet the personalized needs of individual clients.Moreover, existing federated LoRA efficient fine-tuning techniques overlook fine-grained analysis of the tuning matrices. To address this, we conducted preliminary experiments and found that different LoRA matrices exhibit different sensitivity to changes in the direction and magnitude of their vectors.We thus propose a fine-grained federated LoRA tuning method. By fine-tuning the more sensitive directional vectors in the A matrix, which encode shared knowledge, our method learns shared features more effectively across clients and enhances global generalization. Simultaneously, by fine-tuning the more sensitive magnitude vectors in the B matrix, which encode personalized knowledge, our method better captures personalized knowledge, enabling detailed adaptation to local data. The method uses a pipeline combining global and local optimizers. Global optimization further improves local models, achieving collaborative optimization between global and local levels. This improves both the generalization ability of the global model and the personalized adaptation of local models under heterogeneous data scenarios. Experiments on Databricks-Dolly-15k and Natural Instructions with LLaMA2-7B and Deepseek-7B confirm that our method improves global performance by 0.39% and local performance by 0.59%.

MLSep 25, 2025
Conditionally Whitened Generative Models for Probabilistic Time Series Forecasting

Yanfeng Yang, Siwei Chen, Pingping Hu et al.

Probabilistic forecasting of multivariate time series is challenging due to non-stationarity, inter-variable dependencies, and distribution shifts. While recent diffusion and flow matching models have shown promise, they often ignore informative priors such as conditional means and covariances. In this work, we propose Conditionally Whitened Generative Models (CW-Gen), a framework that incorporates prior information through conditional whitening. Theoretically, we establish sufficient conditions under which replacing the traditional terminal distribution of diffusion models, namely the standard multivariate normal, with a multivariate normal distribution parameterized by estimators of the conditional mean and covariance improves sample quality. Guided by this analysis, we design a novel Joint Mean-Covariance Estimator (JMCE) that simultaneously learns the conditional mean and sliding-window covariance. Building on JMCE, we introduce Conditionally Whitened Diffusion Models (CW-Diff) and extend them to Conditionally Whitened Flow Matching (CW-Flow). Experiments on five real-world datasets with six state-of-the-art generative models demonstrate that CW-Gen consistently enhances predictive performance, capturing non-stationary dynamics and inter-variable correlations more effectively than prior-free approaches. Empirical results further demonstrate that CW-Gen can effectively mitigate the effects of distribution shift.

SDSep 25, 2025
DiaMoE-TTS: A Unified IPA-Based Dialect TTS Framework with Mixture-of-Experts and Parameter-Efficient Zero-Shot Adaptation

Ziqi Chen, Gongyu Chen, Yihua Wang et al.

Dialect speech embodies rich cultural and linguistic diversity, yet building text-to-speech (TTS) systems for dialects remains challenging due to scarce data, inconsistent orthographies, and complex phonetic variation. To address these issues, we present DiaMoE-TTS, a unified IPA-based framework that standardizes phonetic representations and resolves grapheme-to-phoneme ambiguities. Built upon the F5-TTS architecture, the system introduces a dialect-aware Mixture-of-Experts (MoE) to model phonological differences and employs parameter-efficient adaptation with Low-Rank Adaptors (LoRA) and Conditioning Adapters for rapid transfer to new dialects. Unlike approaches dependent on large-scale or proprietary resources, DiaMoE-TTS enables scalable, open-data-driven synthesis. Experiments demonstrate natural and expressive speech generation, achieving zero-shot performance on unseen dialects and specialized domains such as Peking Opera with only a few hours of data.

LGMay 29, 2025
DiffER: Categorical Diffusion for Chemical Retrosynthesis

Sean Current, Ziqi Chen, Daniel Adu-Ampratwum et al.

Methods for automatic chemical retrosynthesis have found recent success through the application of models traditionally built for natural language processing, primarily through transformer neural networks. These models have demonstrated significant ability to translate between the SMILES encodings of chemical products and reactants, but are constrained as a result of their autoregressive nature. We propose DiffER, an alternative template-free method for retrosynthesis prediction in the form of categorical diffusion, which allows the entire output SMILES sequence to be predicted in unison. We construct an ensemble of diffusion models which achieves state-of-the-art performance for top-1 accuracy and competitive performance for top-3, top-5, and top-10 accuracy among template-free methods. We prove that DiffER is a strong baseline for a new class of template-free model, capable of learning a variety of synthetic techniques used in laboratory settings and outperforming a variety of other template-free methods on top-k accuracy metrics. By constructing an ensemble of categorical diffusion models with a novel length prediction component with variance, our method is able to approximately sample from the posterior distribution of reactants, producing results with strong metrics of confidence and likelihood. Furthermore, our analyses demonstrate that accurate prediction of the SMILES sequence length is key to further boosting the performance of categorical diffusion models.

MLMay 18, 2025
High-Dimensional Dynamic Covariance Models with Random Forests

Shuguang Yu, Fan Zhou, Yingjie Zhang et al.

This paper introduces a novel nonparametric method for estimating high-dimensional dynamic covariance matrices with multiple conditioning covariates, leveraging random forests and supported by robust theoretical guarantees. Unlike traditional static methods, our dynamic nonparametric covariance models effectively capture distributional heterogeneity. Furthermore, unlike kernel-smoothing methods, which are restricted to a single conditioning covariate, our approach accommodates multiple covariates in a fully nonparametric framework. To the best of our knowledge, this is the first method to use random forests for estimating high-dimensional dynamic covariance matrices. In high-dimensional settings, we establish uniform consistency theory, providing nonasymptotic error rates and model selection properties, even when the response dimension grows sub-exponentially with the sample size. These results hold uniformly across a range of conditioning variables. The method's effectiveness is demonstrated through simulations and a stock dataset analysis, highlighting its ability to model complex dynamics in high-dimensional scenarios.

IVMay 11, 2025
Uni-AIMS: AI-Powered Microscopy Image Analysis

Yanhui Hong, Nan Wang, Zhiyi Xia et al.

This paper presents a systematic solution for the intelligent recognition and automatic analysis of microscopy images. We developed a data engine that generates high-quality annotated datasets through a combination of the collection of diverse microscopy images from experiments, synthetic data generation and a human-in-the-loop annotation process. To address the unique challenges of microscopy images, we propose a segmentation model capable of robustly detecting both small and large objects. The model effectively identifies and separates thousands of closely situated targets, even in cluttered visual environments. Furthermore, our solution supports the precise automatic recognition of image scale bars, an essential feature in quantitative microscopic analysis. Building upon these components, we have constructed a comprehensive intelligent analysis platform and validated its effectiveness and practicality in real-world applications. This study not only advances automatic recognition in microscopy imaging but also ensures scalability and generalizability across multiple application domains, offering a powerful tool for automated microscopic analysis in interdisciplinary research. A online application is made available for researchers to access and evaluate the proposed automated analysis service.

MLFeb 26, 2025
Nonlinear Sparse Generalized Canonical Correlation Analysis for Multi-view High-dimensional Data

Rong Wu, Ziqi Chen, Gen Li et al.

Motivation: Biomedical studies increasingly produce multi-view high-dimensional datasets (e.g., multi-omics) that demand integrative analysis. Existing canonical correlation analysis (CCA) and generalized CCA methods address at most two of the following three key aspects simultaneously: (i) nonlinear dependence, (ii) sparsity for variable selection, and (iii) generalization to more than two data views. There is a pressing need for CCA methods that integrate all three aspects to effectively analyze multi-view high-dimensional data. Results: We propose three nonlinear, sparse, generalized CCA methods, HSIC-SGCCA, SA-KGCCA, and TS-KGCCA, for variable selection in multi-view high-dimensional data. These methods extend existing SCCA-HSIC, SA-KCCA, and TS-KCCA from two-view to multi-view settings. While SA-KGCCA and TS-KGCCA yield multi-convex optimization problems solved via block coordinate descent, HSIC-SGCCA introduces a necessary unit-variance constraint previously ignored in SCCA-HSIC, resulting in a nonconvex, non-multiconvex problem. We efficiently address this challenge by integrating the block prox-linear method with the linearized alternating direction method of multipliers. Simulations and TCGA-BRCA data analysis demonstrate that HSIC-SGCCA outperforms competing methods in multi-view variable selection.

LGNov 7, 2024
Sampling-guided Heterogeneous Graph Neural Network with Temporal Smoothing for Scalable Longitudinal Data Imputation

Zhaoyang Zhang, Ziqi Chen, Qiao Liu et al.

In this paper, we propose a novel framework, the Sampling-guided Heterogeneous Graph Neural Network (SHT-GNN), to effectively tackle the challenge of missing data imputation in longitudinal studies. Unlike traditional methods, which often require extensive preprocessing to handle irregular or inconsistent missing data, our approach accommodates arbitrary missing data patterns while maintaining computational efficiency. SHT-GNN models both observations and covariates as distinct node types, connecting observation nodes at successive time points through subject-specific longitudinal subnetworks, while covariate-observation interactions are represented by attributed edges within bipartite graphs. By leveraging subject-wise mini-batch sampling and a multi-layer temporal smoothing mechanism, SHT-GNN efficiently scales to large datasets, while effectively learning node representations and imputing missing data. Extensive experiments on both synthetic and real-world datasets, including the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset, demonstrate that SHT-GNN significantly outperforms existing imputation methods, even with high missing data rates. The empirical results highlight SHT-GNN's robust imputation capabilities and superior performance, particularly in the context of complex, large-scale longitudinal data.

LGNov 7, 2024
Enhancing Missing Data Imputation through Combined Bipartite Graph and Complete Directed Graph

Zhaoyang Zhang, Hongtu Zhu, Ziqi Chen et al.

In this paper, we aim to address a significant challenge in the field of missing data imputation: identifying and leveraging the interdependencies among features to enhance missing data imputation for tabular data. We introduce a novel framework named the Bipartite and Complete Directed Graph Neural Network (BCGNN). Within BCGNN, observations and features are differentiated as two distinct node types, and the values of observed features are converted into attributed edges linking them. The bipartite segment of our framework inductively learns embedding representations for nodes, efficiently utilizing the comprehensive information encapsulated in the attributed edges. In parallel, the complete directed graph segment adeptly outlines and communicates the complex interdependencies among features. When compared to contemporary leading imputation methodologies, BCGNN consistently outperforms them, achieving a noteworthy average reduction of 15% in mean absolute error for feature imputation tasks under different missing mechanisms. Our extensive experimental investigation confirms that an in-depth grasp of the interdependence structure substantially enhances the model's feature embedding ability. We also highlight the model's superior performance in label prediction tasks involving missing data, and its formidable ability to generalize to unseen data points.

APJun 19, 2024
Confidence interval estimation of mixed oil length with conditional diffusion model

Yanfeng Yang, Lihong Zhang, Ziqi Chen et al.

Accurately estimating the mixed oil length plays a big role in the economic benefit for oil pipeline network. While various proposed methods have tried to predict the mixed oil length, they often exhibit an extremely high probability (around 50\%) of underestimating it. This is attributed to their failure to consider the statistical variability inherent in the estimated length of mixed oil. To address such issues, we propose to use the conditional diffusion model to learn the distribution of the mixed oil length given pipeline features. Subsequently, we design a confidence interval estimation for the length of the mixed oil based on the pseudo-samples generated by the learned diffusion model. To our knowledge, we are the first to present an estimation scheme for confidence interval of the oil-mixing length that considers statistical variability, thereby reducing the possibility of underestimating it. When employing the upper bound of the interval as a reference for excluding the mixed oil, the probability of underestimation can be as minimal as 5\%, a substantial reduction compared to 50\%. Furthermore, utilizing the mean of the generated pseudo samples as the estimator for the mixed oil length enhances prediction accuracy by at least 10\% compared to commonly used methods.

LGJun 10, 2022
$\mathsf{G^2Retro}$ as a Two-Step Graph Generative Models for Retrosynthesis Prediction

Ziqi Chen, Oluwatosin R. Ayinde, James R. Fuchs et al.

Retrosynthesis is a procedure where a target molecule is transformed into potential reactants and thus the synthesis routes can be identified. Recently, computational approaches have been developed to accelerate the design of synthesis routes. In this paper, we develop a generative framework $\mathsf{G^2Retro}$ for one-step retrosynthesis prediction. $\mathsf{G^2Retro}$ imitates the reversed logic of synthetic reactions. It first predicts the reaction centers in the target molecules (products), identifies the synthons needed to assemble the products, and transforms these synthons into reactants. $\mathsf{G^2Retro}$ defines a comprehensive set of reaction center types, and learns from the molecular graphs of the products to predict potential reaction centers. To complete synthons into reactants, $\mathsf{G^2Retro}$ considers all the involved synthon structures and the product structures to identify the optimal completion paths, and accordingly attaches small substructures sequentially to the synthons. Here we show that $\mathsf{G^2Retro}$ is able to better predict the reactants for given products in the benchmark dataset than the state-of-the-art methods.

LGMar 8, 2021
Depth Evaluation for Metal Surface Defects by Eddy Current Testing using Deep Residual Convolutional Neural Networks

Tian Meng, Yang Tao, Ziqi Chen et al.

Eddy current testing (ECT) is an effective technique in the evaluation of the depth of metal surface defects. However, in practice, the evaluation primarily relies on the experience of an operator and is often carried out by manual inspection. In this paper, we address the challenges of automatic depth evaluation of metal surface defects by virtual of state-of-the-art deep learning (DL) techniques. The main contributions are three-fold. Firstly, a highly-integrated portable ECT device is developed, which takes advantage of an advanced field programmable gate array (Zynq-7020 system on chip) and provides fast data acquisition and in-phase/quadrature demodulation. Secondly, a dataset, termed as MDDECT, is constructed using the ECT device by human operators and made openly available. It contains 48,000 scans from 18 defects of different depths and lift-offs. Thirdly, the depth evaluation problem is formulated as a time series classification problem, and various state-of-the-art 1-d residual convolutional neural networks are trained and evaluated on the MDDECT dataset. A 38-layer 1-d ResNeXt achieves an accuracy of 93.58% in discriminating the surface defects in a stainless steel sheet. The depths of the defects vary from 0.3 mm to 2.0 mm in a resolution of 0.1 mm. In addition, results show that the trained ResNeXt1D-38 model is immune to lift-off signals.

LGDec 8, 2020
A Deep Generative Model for Molecule Optimization via One Fragment Modification

Ziqi Chen, Martin Renqiang Min, Srinivasan Parthasarathy et al.

Molecule optimization is a critical step in drug development to improve desired properties of drug candidates through chemical modification. We developed a novel deep generative model Modof over molecular graphs for molecule optimization. Modof modifies a given molecule through the prediction of a single site of disconnection at the molecule and the removal and/or addition of fragments at that site. A pipeline of multiple, identical Modof models is implemented into Modof-pipe to modify an input molecule at multiple disconnection sites. Here we show that Modof-pipe is able to retain major molecular scaffolds, allow controls over intermediate optimization steps and better constrain molecule similarities. Modof-pipe outperforms the state-of-the-art methods on benchmark datasets: without molecular similarity constraints, Modof-pipe achieves 81.2% improvement in octanol-water partition coefficient penalized by synthetic accessibility and ring size; and 51.2%, 25.6% and 9.2% improvement if the optimized molecules are at least 0.2, 0.4 and 0.6 similar to those before optimization, respectively. Modof-pipe is further enhanced into Modof-pipem to allow modifying one molecule to multiple optimized ones. Modof-pipem achieves additional performance improvement as at least 17.8% better than Modof-pipe.

QMDec 4, 2020
Ranking-based Convolutional Neural Network Models for Peptide-MHC Binding Prediction

Ziqi Chen, Martin Renqiang Min, Xia Ning

T-cell receptors can recognize foreign peptides bound to major histocompatibility complex (MHC) class-I proteins, and thus trigger the adaptive immune response. Therefore, identifying peptides that can bind to MHC class-I molecules plays a vital role in the design of peptide vaccines. Many computational methods, for example, the state-of-the-art allele-specific method MHCflurry, have been developed to predict the binding affinities between peptides and MHC molecules. In this manuscript, we develop two allele-specific Convolutional Neural Network (CNN)-based methods named ConvM and SpConvM to tackle the binding prediction problem. Specifically, we formulate the problem as to optimize the rankings of peptide-MHC bindings via ranking-based learning objectives. Such optimization is more robust and tolerant to the measurement inaccuracy of binding affinities, and therefore enables more accurate prioritization of binding peptides. In addition, we develop a new position encoding method in ConvM and SpConvM to better identify the most important amino acids for the binding events. Our experimental results demonstrate that our models significantly outperform the state-of-the-art methods including MHCflurry with an average percentage improvement of 6.70% on AUC and 17.10% on ROC5 across 128 alleles.

LGJun 5, 2020
mFI-PSO: A Flexible and Effective Method in Adversarial Image Generation for Deep Neural Networks

Hai Shu, Ronghua Shi, Qiran Jia et al.

Deep neural networks (DNNs) have achieved great success in image classification, but can be very vulnerable to adversarial attacks with small perturbations to images. To improve adversarial image generation for DNNs, we develop a novel method, called mFI-PSO, which utilizes a Manifold-based First-order Influence measure for vulnerable image and pixel selection and the Particle Swarm Optimization for various objective functions. Our mFI-PSO can thus effectively design adversarial images with flexible, customized options on the number of perturbed pixels, the misclassification probability, and the targeted incorrect class. Experiments demonstrate the flexibility and effectiveness of our mFI-PSO in adversarial attacks and its appealing advantages over some popular methods.