Peiyu Duan

CV
h-index23
7papers
8citations
Novelty51%
AI Score50

7 Papers

85.8LGMay 27
Learning Robust and Task-Invariant Functional Representation from fMRI through Siamese Self-Supervised Learning

Jiyao Wang, Peiyu Duan, Nicha C. Dvornek et al.

Functional magnetic resonance imaging (fMRI) is a powerful tool for investigating human brain function. However, the high cost of data acquisition and the inherent subjectivity of psychiatric rating scales often lead to datasets with small sample sizes and variable label quality, especially when targeting a specific neurological condition. Combined with the inherently high dimensionality of fMRI data, these limitations substantially increase the risk of model overfitting. Recent years have seen growing interest in developing fMRI foundation models by combining multiple datasets; however, the computational resources needed for pretraining and fine-tuning are often prohibitive. We show that a lightweight self-supervised framework yields representations that generalize across diverse downstream tasks, outperforming fully supervised baselines and approaching the performance of large-scale models. We introduce BrainSimSiam, a data-efficient self-supervised representation learning framework that leverages positive-only data pairs to learn robust and generalizable features. We demonstrate that the learned representations achieve strong performance across multiple downstream classification and regression tasks, highlighting the potential of BrainSimSiam for data-limited neuroimaging applications.

50.1CVMay 25Code
BioFact-MoE: Biologically Factorized Mixture of Experts for Vision-Language Prognostic Modeling in Hepatocellular Carcinoma

Junlin Yang, Tian Yu, Nicha C. Dvornek et al.

Hepatocellular carcinoma (HCC) is biologically heterogeneous, shaped by the interplay between hepatic functional reserve and tumor-related oncologic factors; thus, similar survival outcomes may reflect fundamentally different underlying biological processes. Prognostic modeling in HCC is informed by rich multimodal information from multiparametric MRI and radiology reports from routine clinical practice. Existing prognostic vision-language models (VLMs) learn a single entangled latent representation that blends hepatic and tumor-related factors, limiting both accuracy and biological interpretability. We present BioFact-MoE, a biologically factorized Mixture of Experts (MoE) framework that explicitly decomposes liver and tumor factors via biologically supervised experts within a residual MoE survival architecture. On a HCC cohort of N=588 patients (pretrained on 4,582 3D MRI image-report pairs), BioFact-MoE consistently improves survival prediction over all baselines across time horizons, achieving 12-, 18-, and 24-month AUCs of 75.33%, 75.85%, and 73.96%. Beyond scalar risk prediction, gated expert weights enable phenotype-aware risk stratification. Pathway-informed gating uncovers clinically meaningful treatment-associated survival heterogeneity. In held-out validation, hepatic and tumor embeddings show selective associations with liver function and tumor burden markers, respectively (p<0.05), without supervision. The code is available at https://github.com/jy-639/BioFact-MoE.

CVSep 18, 2024Code
Self-Supervised Pre-training Tasks for an fMRI Time-series Transformer in Autism Detection

Yinchi Zhou, Peiyu Duan, Yuexi Du et al.

Autism Spectrum Disorder (ASD) is a neurodevelopmental condition that encompasses a wide variety of symptoms and degrees of impairment, which makes the diagnosis and treatment challenging. Functional magnetic resonance imaging (fMRI) has been extensively used to study brain activity in ASD, and machine learning methods have been applied to analyze resting state fMRI (rs-fMRI) data. However, fewer studies have explored the recent transformer-based models on rs-fMRI data. Given the superiority of transformer models in capturing long-range dependencies in sequence data, we have developed a transformer-based self-supervised framework that directly analyzes time-series fMRI data without computing functional connectivity. To address over-fitting in small datasets and enhance the model performance, we propose self-supervised pre-training tasks to reconstruct the randomly masked fMRI time-series data, investigating the effects of various masking strategies. We then finetune the model for the ASD classification task and evaluate it using two public datasets and five-fold cross-validation with different amounts of training data. The experiments show that randomly masking entire ROIs gives better model performance than randomly masking time points in the pre-training step, resulting in an average improvement of 10.8% for AUC and 9.3% for subject accuracy compared with the transformer model trained from scratch across different levels of training data availability. Our code is available on GitHub.

48.7LGMay 26
FM-fMRI: Event Conditioned Flow Matching for Rest-to-Task fMRI Time-Series Synthesis

Peiyu Duan, Jiyao Wang, Nicha C. Dvornek et al.

Task-based fMRI provides a direct readout of task-evoked neural dynamics, but it is expensive and difficult to acquire at scale, motivating rest-to-task synthesis from widely available resting-state fMRI (rsfMRI). We propose FM-fMRI, an event-conditioned flow-matching model that learns a continuous-time conditional vector field to generate task ROI time series from a subject's rsfMRI and the task event information. The formulation enables fast ODE-based sampling and flexible conditioning over heterogeneous event schedules. Rather than optimizing for pointwise reconstruction, we evaluated generated signals using complementary criteria that probe temporal and spectral structure, subject and group-level connectome consistency, and distributional alignment. On the public Human Connectome Project and internal BioPoint autism cohort, FM-fMRI achieves the strongest spectral and connectivity agreement and improved distribution-level matching over conditional diffusion, generative adversarial networks (GANs), and variational autoencoders (VAEs) baselines. Furthermore, we augment the BioPoint cohort by synthesizing task-fMRI ROI time series with our method, improving downstream autism classification and demonstrating practical utility in data-limited clinical settings. The code will be available on GitHub.

15.3CVMay 8
AGA3DNet: Anatomy-Guided Gaussian Priors with Multi-view xLSTM for 3D Brain MRI Subtype Classification

Peiyu Duan, Xueqi Guo, Sepehr Farhand et al.

Accurate 3D brain MRI subtype classification benefits from both localized anatomical cues and long-range contextual reasoning. We present AGA3DNet, a report-grounded framework that incorporates brief anatomical phrases extracted from radiology reports as a soft anatomical prior channel and fuses it with a lightweight 3D CNN and multi-view xLSTM aggregation. Specifically, extracted anatomical phrases are mapped to atlas-defined regions and converted into smooth spatial priors using a signed-distance transform followed by Gaussian weighting, providing interpretable, anatomy-grounded guidance without requiring dense voxel annotations. We evaluate AGA3DNet on a retrospective institutional brain MRI cohort for abnormal subtype discrimination and compare against reproducible 3D classification baselines. AGA3DNet achieves improved overall balance across performance metrics and supports clinically interpretable localization through the prior channel. We discuss limitations related to single-cohort evaluation and the lack of large-scale public brain MRI datasets paired with radiology reports under broadly usable terms.

IVFeb 15, 2025
Towards Zero-Shot Task-Generalizable Learning on fMRI

Jiyao Wang, Nicha C. Dvornek, Peiyu Duan et al.

Functional MRI measuring BOLD signal is an increasingly important imaging modality in studying brain functions and neurological disorders. It can be acquired in either a resting-state or a task-based paradigm. Compared to resting-state fMRI, task-based fMRI is acquired while the subject is performing a specific task designed to enhance study-related brain activities. Consequently, it generally has more informative task-dependent signals. However, due to the variety of task designs, it is much more difficult than in resting state to aggregate task-based fMRI acquired in different tasks to train a generalizable model. To resolve this complication, we propose a supervised task-aware network TA-GAT that jointly learns a general-purpose encoder and task-specific contextual information. The encoder-generated embedding and the learned contextual information are then combined as input to multiple modules for performing downstream tasks. We believe that the proposed task-aware architecture can plug-and-play in any neural network architecture to incorporate the prior knowledge of fMRI tasks into capturing functional brain patterns.

LGJun 17, 2024
STNAGNN: Data-driven Spatio-temporal Brain Connectivity beyond FC

Jiyao Wang, Nicha C. Dvornek, Peiyu Duan et al.

In recent years, graph neural networks (GNNs) have been widely applied in the analysis of brain fMRI, yet defining the connectivity between ROIs remains a challenge in noisy fMRI data. Among all approaches, Functional Connectome (FC) is the most popular method. Computed by the correlation coefficients between ROI time series, FC is a powerful and computationally efficient way to estimate ROI connectivity. However, it is well known for neglecting structural connections and causality in ROI interactions. Also, FC becomes much more noisy in the short spatio-temporal sliding-window subsequences of fMRI. Effective Connectome (EC) is proposed as a directional alternative, but is difficult to accurately estimate. Furthermore, for optimal GNN performance, usually only a small percentage of the strongest connections are selected as sparse edges, resulting in oversimplification of complex brain connections. To tackle these challenges, we propose the Spatio-Temporal Node Attention Graph Neural Network (STNAGNN) as a data-driven alternative that combines sparse predefined FC with dense data-driven spatio-temporal connections, allowing for flexible and spatio-temporal learning of ROI interaction patterns.