Jun-En Ding

CL
h-index13
22papers
78citations
Novelty47%
AI Score54

22 Papers

82.4CLMay 8Code
MedAction: Towards Active Multi-turn Clinical Diagnostic LLMs

Hsin-Ling Hsu, Zizheng Wang, Donghua Zhang et al.

Most existing LLM diagnoses are evaluated on static, single-turn settings where complete patient information is provided upfront, an oversimplification of real clinical practice. We study active diagnosis: the real-life clinical process of starting from initial observation, ordering tests, interpreting results, and updating a differential diagnosis across multiple turns. Through systematic analysis, we identify three recurring failure modes in current LLMs: ungrounded test ordering, unreliable diagnostic update, and degraded multi-turn coherence. Together, these failures reveal a core deficit: existing medical training data teaches models to reason from complete information but not to act under evolving, partial evidence. To address this gap, we introduce MedAction, a tree-structured distillation pipeline that synthesizes diverse and high-quality multi-turn diagnostic trajectories via LLM-environment interaction. We propose two knowledge-graph-grounded metrics to filter trajectory quality: Disease Trajectory Consistency (DTC), which tracks whether the model's hypothesis converges toward the correct diagnosis, and Reasoning-Action Consistency (RAC), which verifies that belief updates are driven by gathered evidence. Using this pipeline, we construct MedAction-32K, a dataset of 32,681 trajectories from 2,896 PMC cases. Fine-tuning an 8B model on MedAction-32K achieves state-of-the-art performance among open-source models on both MedR-Bench and our curated MedAction-300-Hard benchmark, pushing the edge for open-source medical LLMs.

LGDec 31, 2022
Hospital transfer risk prediction for COVID-19 patients from a medicalized hotel based on Diffusion GraphSAGE

Jun-En Ding, Chih-Ho Hsu, Kuan-Chia Ling et al.

The global COVID-19 pandemic has caused more than six million deaths worldwide. Medicalized hotels were established in Taiwan as quarantine facilities for COVID-19 patients with no or mild symptoms. Due to limited medical care available at these hotels, it is of paramount importance to identify patients at risk of clinical deterioration. This study aimed to develop and evaluate a graph-based deep learning approach for progressive hospital transfer risk prediction in a medicalized hotel setting. Vital sign measurements were obtained for 632 patients and daily patient similarity graphs were constructed. Inductive graph convolutional network models were trained on top of the temporally integrated graphs to predict hospital transfer risk. The proposed models achieved AUC scores above 0.83 for hospital transfer risk prediction based on the measurements of past 1, 2, and 3 days, outperforming baseline machine learning methods. A post-hoc analysis on the constructed diffusion-based graph using Local Clustering Coefficient discovered a high-risk cluster with significantly older mean age, higher body temperature, lower SpO2, and shorter length of stay. Further time-to-hospital-transfer survival analysis also revealed a significant decrease in survival probability in the discovered high-risk cluster. The obtained results demonstrated promising predictability and interpretability of the proposed graph-based approach. This technique may help preemptively detect high-risk patients at community-based medical facilities similar to a medicalized hotel.

CLJul 17, 2024
MEDFuse: Multimodal EHR Data Fusion with Masked Lab-Test Modeling and Large Language Models

Thao Minh Nguyen Phan, Cong-Tinh Dao, Chenwei Wu et al.

Electronic health records (EHRs) are multimodal by nature, consisting of structured tabular features like lab tests and unstructured clinical notes. In real-life clinical practice, doctors use complementary multimodal EHR data sources to get a clearer picture of patients' health and support clinical decision-making. However, most EHR predictive models do not reflect these procedures, as they either focus on a single modality or overlook the inter-modality interactions/redundancy. In this work, we propose MEDFuse, a Multimodal EHR Data Fusion framework that incorporates masked lab-test modeling and large language models (LLMs) to effectively integrate structured and unstructured medical data. MEDFuse leverages multimodal embeddings extracted from two sources: LLMs fine-tuned on free clinical text and masked tabular transformers trained on structured lab test results. We design a disentangled transformer module, optimized by a mutual information loss to 1) decouple modality-specific and modality-shared information and 2) extract useful joint representation from the noise and redundancy present in clinical notes. Through comprehensive validation on the public MIMIC-III dataset and the in-house FEMH dataset, MEDFuse demonstrates great potential in advancing clinical predictions, achieving over 90% F1 score in the 10-disease multi-label classification task.

CVNov 25, 2023
Parkinson's Disease Classification Using Contrastive Graph Cross-View Learning with Multimodal Fusion of SPECT Images and Clinical Features

Jun-En Ding, Chien-Chin Hsu, Feng Liu

Parkinson's Disease (PD) affects millions globally, impacting movement. Prior research utilized deep learning for PD prediction, primarily focusing on medical images, neglecting the data's underlying manifold structure. This work proposes a multimodal approach encompassing both image and non-image features, leveraging contrastive cross-view graph fusion for PD classification. We introduce a novel multimodal co-attention module, integrating embeddings from separate graph views derived from low-dimensional representations of images and clinical features. This enables more robust and structured feature extraction for improved multi-view data analysis. Additionally, a simplified contrastive loss-based fusion method is devised to enhance cross-view fusion learning. Our graph-view multimodal approach achieves an accuracy of 0.91 and an area under the receiver operating characteristic curve (AUC) of 0.93 in five-fold cross-validation. It also demonstrates superior predictive capabilities on non-image data compared to solely machine learning-based methods.

AINov 14, 2025
CURENet: Combining Unified Representations for Efficient Chronic Disease Prediction

Cong-Tinh Dao, Nguyen Minh Thao Phan, Jun-En Ding et al.

Electronic health records (EHRs) are designed to synthesize diverse data types, including unstructured clinical notes, structured lab tests, and time-series visit data. Physicians draw on these multimodal and temporal sources of EHR data to form a comprehensive view of a patient's health, which is crucial for informed therapeutic decision-making. Yet, most predictive models fail to fully capture the interactions, redundancies, and temporal patterns across multiple data modalities, often focusing on a single data type or overlooking these complexities. In this paper, we present CURENet, a multimodal model (Combining Unified Representations for Efficient chronic disease prediction) that integrates unstructured clinical notes, lab tests, and patients' time-series data by utilizing large language models (LLMs) for clinical text processing and textual lab tests, as well as transformer encoders for longitudinal sequential visits. CURENet has been capable of capturing the intricate interaction between different forms of clinical data and creating a more reliable predictive model for chronic illnesses. We evaluated CURENet using the public MIMIC-III and private FEMH datasets, where it achieved over 94\% accuracy in predicting the top 10 chronic conditions in a multi-label framework. Our findings highlight the potential of multimodal EHR integration to enhance clinical decision-making and improve patient outcomes.

SPFeb 24
NeuroNarrator: A Generalist EEG-to-Text Foundation Model for Clinical Interpretation via Spectro-Spatial Grounding and Temporal State-Space Reasoning

Guoan Wang, Shihao Yang, Jun-en Ding et al.

Electroencephalography (EEG) provides a non-invasive window into neural dynamics at high temporal resolution and plays a pivotal role in clinical neuroscience research. Despite this potential, prevailing computational approaches to EEG analysis remain largely confined to task-specific classification objectives or coarse-grained pattern recognition, offering limited support for clinically meaningful interpretation. To address these limitations, we introduce NeuroNarrator, the first generalist EEG-to-text foundation model designed to translate electrophysiological segments into precise clinical narratives. A cornerstone of this framework is the curation of NeuroCorpus-160K, the first harmonized large-scale resource pairing over 160,000 EEG segments with structured, clinically grounded natural-language descriptions. Our architecture first aligns temporal EEG waveforms with spatial topographic maps via a rigorous contrastive objective, establishing spectro-spatially grounded representations. Building on this grounding, we condition a Large Language Model through a state-space-inspired formulation that integrates historical temporal and spectral context to support coherent clinical narrative generation. This approach establishes a principled bridge between continuous signal dynamics and discrete clinical language, enabling interpretable narrative generation that facilitates expert interpretation and supports clinical reporting workflows. Extensive evaluations across diverse benchmarks and zero-shot transfer tasks highlight NeuroNarrator's capacity to integrate temporal, spectral, and spatial dynamics, positioning it as a foundational framework for time-frequency-aware, open-ended clinical interpretation of electrophysiological data.

AIFeb 13
NeuroWeaver: An Autonomous Evolutionary Agent for Exploring the Programmatic Space of EEG Analysis Pipelines

Guoan Wang, Shihao Yang, Jun-En Ding et al.

Although foundation models have demonstrated remarkable success in general domains, the application of these models to electroencephalography (EEG) analysis is constrained by substantial data requirements and high parameterization. These factors incur prohibitive computational costs, thereby impeding deployment in resource-constrained clinical environments. Conversely, general-purpose automated machine learning frameworks are often ill-suited for this domain, as exploration within an unbounded programmatic space fails to incorporate essential neurophysiological priors and frequently yields solutions that lack scientific plausibility. To address these limitations, we propose NeuroWeaver, a unified autonomous evolutionary agent designed to generalize across diverse EEG datasets and tasks by reformulating pipeline engineering as a discrete constrained optimization problem. Specifically, we employ a Domain-Informed Subspace Initialization to confine the search to neuroscientifically plausible manifolds, coupled with a Multi-Objective Evolutionary Optimization that dynamically balances performance, novelty, and efficiency via self-reflective refinement. Empirical evaluations across five heterogeneous benchmarks demonstrate that NeuroWeaver synthesizes lightweight solutions that consistently outperform state-of-the-art task-specific methods and achieve performance comparable to large-scale foundation models, despite utilizing significantly fewer parameters.

STJan 15, 2023
Sequential Graph Attention Learning for Predicting Dynamic Stock Trends (Student Abstract)

Tzu-Ya Lai, Wen Jung Cheng, Jun-En Ding

The stock market is characterized by a complex relationship between companies and the market. This study combines a sequential graph structure with attention mechanisms to learn global and local information within temporal time. Specifically, our proposed "GAT-AGNN" module compares model performance across multiple industries as well as within single industries. The results show that the proposed framework outperforms the state-of-the-art methods in predicting stock trends across multiple industries on Taiwan Stock datasets.

CLDec 18, 2024
Multi-OphthaLingua: A Multilingual Benchmark for Assessing and Debiasing LLM Ophthalmological QA in LMICs

David Restrepo, Chenwei Wu, Zhengxu Tang et al.

Current ophthalmology clinical workflows are plagued by over-referrals, long waits, and complex and heterogeneous medical records. Large language models (LLMs) present a promising solution to automate various procedures such as triaging, preliminary tests like visual acuity assessment, and report summaries. However, LLMs have demonstrated significantly varied performance across different languages in natural language question-answering tasks, potentially exacerbating healthcare disparities in Low and Middle-Income Countries (LMICs). This study introduces the first multilingual ophthalmological question-answering benchmark with manually curated questions parallel across languages, allowing for direct cross-lingual comparisons. Our evaluation of 6 popular LLMs across 7 different languages reveals substantial bias across different languages, highlighting risks for clinical deployment of LLMs in LMICs. Existing debiasing methods such as Translation Chain-of-Thought or Retrieval-augmented generation (RAG) by themselves fall short of closing this performance gap, often failing to improve performance across all languages and lacking specificity for the medical domain. To address this issue, We propose CLARA (Cross-Lingual Reflective Agentic system), a novel inference time de-biasing method leveraging retrieval augmented generation and self-verification. Our approach not only improves performance across all languages but also significantly reduces the multilingual bias gap, facilitating equitable LLM application across the globe.

CLMar 23, 2025
MedPlan:A Two-Stage RAG-Based System for Personalized Medical Plan Generation

Hsin-Ling Hsu, Cong-Tinh Dao, Luning Wang et al.

Despite recent success in applying large language models (LLMs) to electronic health records (EHR), most systems focus primarily on assessment rather than treatment planning. We identify three critical limitations in current approaches: they generate treatment plans in a single pass rather than following the sequential reasoning process used by clinicians; they rarely incorporate patient-specific historical context; and they fail to effectively distinguish between subjective and objective clinical information. Motivated by the SOAP methodology (Subjective, Objective, Assessment, Plan), we introduce \ours{}, a novel framework that structures LLM reasoning to align with real-life clinician workflows. Our approach employs a two-stage architecture that first generates a clinical assessment based on patient symptoms and objective data, then formulates a structured treatment plan informed by this assessment and enriched with patient-specific information through retrieval-augmented generation. Comprehensive evaluation demonstrates that our method significantly outperforms baseline approaches in both assessment accuracy and treatment plan quality.

CLMar 2, 2024
Large Language Multimodal Models for 5-Year Chronic Disease Cohort Prediction Using EHR Data

Jun-En Ding, Phan Nguyen Minh Thao, Wen-Chih Peng et al.

Chronic diseases such as diabetes are the leading causes of morbidity and mortality worldwide. Numerous research studies have been attempted with various deep learning models in diagnosis. However, most previous studies had certain limitations, including using publicly available datasets (e.g. MIMIC), and imbalanced data. In this study, we collected five-year electronic health records (EHRs) from the Taiwan hospital database, including 1,420,596 clinical notes, 387,392 laboratory test results, and more than 1,505 laboratory test items, focusing on research pre-training large language models. We proposed a novel Large Language Multimodal Models (LLMMs) framework incorporating multimodal data from clinical notes and laboratory test results for the prediction of chronic disease risk. Our method combined a text embedding encoder and multi-head attention layer to learn laboratory test values, utilizing a deep neural network (DNN) module to merge blood features with chronic disease semantics into a latent space. In our experiments, we observe that clinicalBERT and PubMed-BERT, when combined with attention fusion, can achieve an accuracy of 73% in multiclass chronic diseases and diabetes prediction. By transforming laboratory test values into textual descriptions and employing the Flan T-5 model, we achieved a 76% Area Under the ROC Curve (AUROC), demonstrating the effectiveness of leveraging numerical text data for training and inference in language models. This approach significantly improves the accuracy of early-stage diabetes prediction.

AIMay 22, 2025
No Black Boxes: Interpretable and Interactable Predictive Healthcare with Knowledge-Enhanced Agentic Causal Discovery

Xiaoxue Han, Pengfei Hu, Jun-En Ding et al.

Deep learning models trained on extensive Electronic Health Records (EHR) data have achieved high accuracy in diagnosis prediction, offering the potential to assist clinicians in decision-making and treatment planning. However, these models lack two crucial features that clinicians highly value: interpretability and interactivity. The ``black-box'' nature of these models makes it difficult for clinicians to understand the reasoning behind predictions, limiting their ability to make informed decisions. Additionally, the absence of interactive mechanisms prevents clinicians from incorporating their own knowledge and experience into the decision-making process. To address these limitations, we propose II-KEA, a knowledge-enhanced agent-driven causal discovery framework that integrates personalized knowledge databases and agentic LLMs. II-KEA enhances interpretability through explicit reasoning and causal analysis, while also improving interactivity by allowing clinicians to inject their knowledge and experience through customized knowledge bases and prompts. II-KEA is evaluated on both MIMIC-III and MIMIC-IV, demonstrating superior performance along with enhanced interpretability and interactivity, as evidenced by its strong results from extensive case studies.

LGApr 2, 2025
Flexible and Explainable Graph Analysis for EEG-based Alzheimer's Disease Classification

Jing Wang, Jun-En Ding, Feng Liu et al.

Alzheimer's Disease is a progressive neurological disorder that is one of the most common forms of dementia. It leads to a decline in memory, reasoning ability, and behavior, especially in older people. The cause of Alzheimer's Disease is still under exploration and there is no all-inclusive theory that can explain the pathologies in each individual patient. Nevertheless, early intervention has been found to be effective in managing symptoms and slowing down the disease's progression. Recent research has utilized electroencephalography (EEG) data to identify biomarkers that distinguish Alzheimer's Disease patients from healthy individuals. Prior studies have used various machine learning methods, including deep learning and graph neural networks, to examine electroencephalography-based signals for identifying Alzheimer's Disease patients. In our research, we proposed a Flexible and Explainable Gated Graph Convolutional Network (GGCN) with Multi-Objective Tree-Structured Parzen Estimator (MOTPE) hyperparameter tuning. This provides a flexible solution that efficiently identifies the optimal number of GGCN blocks to achieve the optimized precision, specificity, and recall outcomes, as well as the optimized area under the Receiver Operating Characteristic (AUC). Our findings demonstrated a high efficacy with an over 0.9 Receiver Operating Characteristic score, alongside precision, specificity, and recall scores in distinguishing health control with Alzheimer's Disease patients in Moderate to Severe Dementia using the power spectrum density (PSD) of electroencephalography signals across various frequency bands. Moreover, our research enhanced the interpretability of the embedded adjacency matrices, revealing connectivity differences in frontal and parietal brain regions between Alzheimer's patients and healthy individuals.

IVOct 23, 2024
Enhancing Multimodal Medical Image Classification using Cross-Graph Modal Contrastive Learning

Jun-En Ding, Chien-Chin Hsu, Chi-Hsiang Chu et al.

The classification of medical images is a pivotal aspect of disease diagnosis, often enhanced by deep learning techniques. However, traditional approaches typically focus on unimodal medical image data, neglecting the integration of diverse non-image patient data. This paper proposes a novel Cross-Graph Modal Contrastive Learning (CGMCL) framework for multimodal structured data from different data domains to improve medical image classification. The model effectively integrates both image and non-image data by constructing cross-modality graphs and leveraging contrastive learning to align multimodal features in a shared latent space. An inter-modality feature scaling module further optimizes the representation learning process by reducing the gap between heterogeneous modalities. The proposed approach is evaluated on two datasets: a Parkinson's disease (PD) dataset and a public melanoma dataset. Results demonstrate that CGMCL outperforms conventional unimodal methods in accuracy, interpretability, and early disease prediction. Additionally, the method shows superior performance in multi-class melanoma classification. The CGMCL framework provides valuable insights into medical image classification while offering improved disease interpretability and predictive capabilities.

NCFeb 29, 2024
Spatial Craving Patterns in Marijuana Users: Insights from fMRI Brain Connectivity Analysis with High-Order Graph Attention Neural Networks

Jun-En Ding, Shihao Yang, Anna Zilverstand et al.

The excessive consumption of marijuana can induce substantial psychological and social consequences. In this investigation, we propose an elucidative framework termed high-order graph attention neural networks (HOGANN) for the classification of Marijuana addiction, coupled with an analysis of localized brain network communities exhibiting abnormal activities among chronic marijuana users. HOGANN integrates dynamic intrinsic functional brain networks, estimated from functional magnetic resonance imaging (fMRI), using graph attention-based long short-term memory (GAT-LSTM) to capture temporal network dynamics. We employ a high-order attention module for information fusion and message passing among neighboring nodes, enhancing the network community analysis. Our model is validated across two distinct data cohorts, yielding substantially higher classification accuracy than benchmark algorithms. Furthermore, we discern the most pertinent subnetworks and cognitive regions affected by persistent marijuana consumption, indicating adverse effects on functional brain networks, particularly within the dorsal attention and frontoparietal networks. Intriguingly, our model demonstrates superior performance in cohorts exhibiting prolonged dependence, implying that prolonged marijuana usage induces more pronounced alterations in brain networks. The model proficiently identifies craving brain maps, thereby delineating critical brain regions for analysis

CLFeb 15
Chain-of-Thought Reasoning with Large Language Models for Clinical Alzheimer's Disease Assessment and Diagnosis

Tongze Zhang, Jun-En Ding, Melik Ozolcer et al.

Alzheimer's disease (AD) has become a prevalent neurodegenerative disease worldwide. Traditional diagnosis still relies heavily on medical imaging and clinical assessment by physicians, which is often time-consuming and resource-intensive in terms of both human expertise and healthcare resources. In recent years, large language models (LLMs) have been increasingly applied to the medical field using electronic health records (EHRs), yet their application in Alzheimer's disease assessment remains limited, particularly given that AD involves complex multifactorial etiologies that are difficult to observe directly through imaging modalities. In this work, we propose leveraging LLMs to perform Chain-of-Thought (CoT) reasoning on patients' clinical EHRs. Unlike direct fine-tuning of LLMs on EHR data for AD classification, our approach utilizes LLM-generated CoT reasoning paths to provide the model with explicit diagnostic rationale for AD assessment, followed by structured CoT-based predictions. This pipeline not only enhances the model's ability to diagnose intrinsically complex factors but also improves the interpretability of the prediction process across different stages of AD progression. Experimental results demonstrate that the proposed CoT-based diagnostic framework significantly enhances stability and diagnostic performance across multiple CDR grading tasks, achieving up to a 15% improvement in F1 score compared to the zero-shot baseline method.

AIOct 18, 2025
Foundation and Large-Scale AI Models in Neuroscience: A Comprehensive Review

Shihao Yang, Xiying Huang, Danilo Bernardo et al.

The advent of large-scale artificial intelligence (AI) models has a transformative effect on neuroscience research, which represents a paradigm shift from the traditional computational methods through the facilitation of end-to-end learning from raw brain signals and neural data. In this paper, we explore the transformative effects of large-scale AI models on five major neuroscience domains: neuroimaging and data processing, brain-computer interfaces and neural decoding, molecular neuroscience and genomic modeling, clinical assistance and translational frameworks, and disease-specific applications across neurological and psychiatric disorders. These models are demonstrated to address major computational neuroscience challenges, including multimodal neural data integration, spatiotemporal pattern interpretation, and the derivation of translational frameworks for clinical deployment. Moreover, the interaction between neuroscience and AI has become increasingly reciprocal, as biologically informed architectural constraints are now incorporated to develop more interpretable and computationally efficient models. This review highlights both the notable promise of such technologies and key implementation considerations, with particular emphasis on rigorous evaluation frameworks, effective domain knowledge integration, and comprehensive ethical guidelines for clinical use. Finally, a systematic listing of critical neuroscience datasets used to derive and validate large-scale AI models across diverse research applications is provided.

LGOct 13, 2025
Variational Mixture of Graph Neural Experts for Alzheimer's Disease Biomarker Recognition in EEG Brain Networks

Jun-En Ding, Anna Zilverstand, Shihao Yang et al.

Dementia disorders such as Alzheimer's disease (AD) and frontotemporal dementia (FTD) exhibit overlapping electrophysiological signatures in EEG that challenge accurate diagnosis. Existing EEG-based methods are limited by full-band frequency analysis that hinders precise differentiation of dementia subtypes and severity stages. We propose a variational mixture of graph neural experts (VMoGE) that integrates frequency-specific biomarker identification with structured variational inference for enhanced dementia diagnosis and staging. VMoGE employs a multi-granularity transformer to extract multi-scale temporal patterns across four frequency bands, followed by a variational graph convolutional encoder using Gaussian Markov Random Field priors. Through structured variational inference and adaptive gating, VMoGE links neural specialization to physiologically meaningful EEG frequency bands. Evaluated on two diverse datasets for both subtype classification and severity staging, VMoGE achieves superior performance with AUC improvements of +4% to +10% over state-of-the-art methods. Moreover, VMoGE provides interpretable insights through expert weights that correlate with clinical indicators and spatial patterns aligned with neuropathological signatures, facilitating EEG biomarker discovery for comprehensive dementia diagnosis and monitoring.

CLAug 31, 2025
RPRO: Ranked Preference Reinforcement Optimization for Enhancing Medical QA and Diagnostic Reasoning

Chia-Hsuan Hsu, Jun-En Ding, Hsin-Ling Hsu et al.

Medical question answering requires advanced reasoning that integrates domain knowledge with logical inference. However, existing large language models (LLMs) often generate reasoning chains that lack factual accuracy and clinical reliability. We propose Ranked Preference Reinforcement Optimization (RPRO), a novel framework that combines reinforcement learning with preference-driven reasoning refinement to enhance clinical chain-of-thought (CoT) performance. RPRO distinguishes itself from prior approaches by employing task-adaptive reasoning templates and a probabilistic evaluation mechanism that aligns model outputs with established clinical workflows, while automatically identifying and correcting low-quality reasoning chains. Unlike traditional pairwise preference methods, RPRO introduces a groupwise ranking optimization based on the Bradley--Terry model and incorporates KL-divergence regularization for stable training. Experiments on PubMedQA, MedQA-USMLE, and a real-world clinical dataset from Far Eastern Memorial Hospital (FEMH) demonstrate consistent improvements over strong baselines. Remarkably, our 2B-parameter model outperforms much larger 7B--20B models, including medical-specialized variants. These findings demonstrate that combining preference optimization with quality-driven refinement provides a scalable and clinically grounded approach to building more reliable medical LLMs.

NEFeb 26, 2025
NeuroTree: Hierarchical Functional Brain Pathway Decoding for Mental Health Disorders

Jun-En Ding, Dongsheng Luo, Anna Zilverstand et al.

Mental disorders are among the most widespread diseases globally. Analyzing functional brain networks through functional magnetic resonance imaging (fMRI) is crucial for understanding mental disorder behaviors. Although existing fMRI-based graph neural networks (GNNs) have demonstrated significant potential in brain network feature extraction, they often fail to characterize complex relationships between brain regions and demographic information in mental disorders. To overcome these limitations, we propose a learnable NeuroTree framework that integrates a k-hop AGE-GCN with neural ordinary differential equations (ODEs) and contrastive masked functional connectivity (CMFC) to enhance similarities and dissimilarities of brain region distance. Furthermore, NeuroTree effectively decodes fMRI network features into tree structures, which improves the capture of high-order brain regional pathway features and enables the identification of hierarchical neural behavioral patterns essential for understanding disease-related brain subnetworks. Our empirical evaluations demonstrate that NeuroTree achieves state-of-the-art performance across two distinct mental disorder datasets. It provides valuable insights into age-related deterioration patterns, elucidating their underlying neural mechanisms.

SEJun 26, 2024
EHR-Based Mobile and Web Platform for Chronic Disease Risk Prediction Using Large Language Multimodal Models

Chun-Chieh Liao, Wei-Ting Kuo, I-Hsuan Hu et al.

Traditional diagnosis of chronic diseases involves in-person consultations with physicians to identify the disease. However, there is a lack of research focused on predicting and developing application systems using clinical notes and blood test values. We collected five years of Electronic Health Records (EHRs) from Taiwan's hospital database between 2017 and 2021 as an AI database. Furthermore, we developed an EHR-based chronic disease prediction platform utilizing Large Language Multimodal Models (LLMMs), successfully integrating with frontend web and mobile applications for prediction. This prediction platform can also connect to the hospital's backend database, providing physicians with real-time risk assessment diagnostics. The demonstration link can be found at https://www.youtube.com/watch?v=oqmL9DEDFgA.

CVApr 6, 2021
Dopamine Transporter SPECT Image Classification for Neurodegenerative Parkinsonism via Diffusion Maps and Machine Learning Classifiers

Jun-En Ding, Chi-Hsiang Chu, Mong-Na Lo Huang et al.

Neurodegenerative parkinsonism can be assessed by dopamine transporter single photon emission computed tomography (DaT-SPECT). Although generating images is time consuming, these images can show interobserver variability and they have been visually interpreted by nuclear medicine physicians to date. Accordingly, this study aims to provide an automatic and robust method based on Diffusion Maps and machine learning classifiers to classify the SPECT images into two types, namely Normal and Abnormal DaT-SPECT image groups. In the proposed method, the 3D images of N patients are mapped to an N by N pairwise distance matrix and are visualized in Diffusion Maps coordinates. The images of the training set are embedded into a low-dimensional space by using diffusion maps. Moreover, we use Nyström's out-of-sample extension, which embeds new sample points as the testing set in the reduced space. Testing samples in the embedded space are then classified into two types through the ensemble classifier with Linear Discriminant Analysis (LDA) and voting procedure through twenty-five-fold cross-validation results. The feasibility of the method is demonstrated via Parkinsonism Progression Markers Initiative (PPMI) dataset of 1097 subjects and a clinical cohort from Kaohsiung Chang Gung Memorial Hospital (KCGMH-TW) of 630 patients. We compare performances using Diffusion Maps with those of three alternative manifold methods for dimension reduction, namely Locally Linear Embedding (LLE), Isomorphic Mapping Algorithm (Isomap), and Kernel Principal Component Analysis (Kernel PCA). We also compare results using 2D and 3D CNN methods. The diffusion maps method has an average accuracy of 98% for the PPMI and 90% for the KCGMH-TW dataset with twenty-five fold cross-validation results. It outperforms the other three methods concerning the overall accuracy and the robustness in the training and testing samples.