LGOct 3, 2022
A New Hip Fracture Risk Index Derived from FEA-Computed Proximal Femur Fracture Loads and Energies-to-FailureXuewei Cao, Joyce H Keyak, Sigurdur Sigurdsson et al.
Hip fracture risk assessment is an important but challenging task. Quantitative CT-based patient specific finite element analysis (FEA) computes the force (fracture load) to break the proximal femur in a particular loading condition. It provides different structural information about the proximal femur that can influence a subject overall fracture risk. To obtain a more robust measure of fracture risk, we used principal component analysis (PCA) to develop a global FEA computed fracture risk index that incorporates the FEA-computed yield and ultimate failure loads and energies to failure in four loading conditions (single-limb stance and impact from a fall onto the posterior, posterolateral, and lateral aspects of the greater trochanter) of 110 hip fracture subjects and 235 age and sex matched control subjects from the AGES-Reykjavik study. We found that the first PC (PC1) of the FE parameters was the only significant predictor of hip fracture. Using a logistic regression model, we determined if prediction performance for hip fracture using PC1 differed from that using FE parameters combined by stratified random resampling with respect to hip fracture status. The results showed that the average of the area under the receive operating characteristic curve (AUC) using PC1 was always higher than that using all FE parameters combined in the male subjects. The AUC of PC1 and AUC of the FE parameters combined were not significantly different than that in the female subjects or in all subjects
QMMay 29
DXA-Derived Skeletal Phenotypes and Hip Fracture Risk: A Backdoor-Adjusted Causal AnalysisZixin Shi, Chen Zhao, Meiling Zhou et al.
Purpose: To compare dual-energy X-ray absorptiometry (DXA)-derived hip skeletal phenotypes in relation to hip fracture risk using prespecified confounder adjustment and to assess whether phenotypes ranked by their backdoor-adjusted average treatment effects (ATEs) improve risk stratification. Methods: We analyzed 21,098 UK Biobank participants with linked health records, hip DXA-derived skeletal measures, and prespecified covariates. Sixteen phenotypes spanning bone mineral content (BMC), bone mineral density (BMD), and T-score across hip-related regions were evaluated. Confounder selection was guided by a prespecified directed acyclic graph (DAG). Backdoor-adjusted ATEs were estimated on the absolute risk-difference scale per standard deviation (SD) increase. Effect heterogeneity was evaluated for total femur BMD, and downstream prediction was assessed using clinical variables combined with phenotypes ranked by ATE magnitude. Results: Among 21,098 participants, 115 had hip fractures. All 16 phenotypes showed negative backdoor-adjusted ATEs per SD increase. The largest ATEs were observed for total femur BMC and total femur BMD, each with a risk difference of -0.0047, corresponding to approximately 4.7 fewer hip fractures per 1,000 participants per SD higher phenotype value. Conditional effects of total femur BMD were stronger among older participants and those with lower BMI. In prediction, clinical variables plus the top 11 ATE-ranked phenotypes achieved higher AUC than FRAX with femoral neck BMD (0.842 vs. 0.709), with higher sensitivity (0.748 vs. 0.443) and similar specificity (0.793 vs. 0.777). Conclusion: DXA-derived hip skeletal phenotypes differed in their backdoor-adjusted ATEs. Phenotype-level causal evaluation may help identify informative DXA measures for risk stratification.
LGApr 12, 2023
CLCLSA: Cross-omics Linked embedding with Contrastive Learning and Self Attention for multi-omics integration with incomplete multi-omics dataChen Zhao, Anqi Liu, Xiao Zhang et al.
Integration of heterogeneous and high-dimensional multi-omics data is becoming increasingly important in understanding genetic data. Each omics technique only provides a limited view of the underlying biological process and integrating heterogeneous omics layers simultaneously would lead to a more comprehensive and detailed understanding of diseases and phenotypes. However, one obstacle faced when performing multi-omics data integration is the existence of unpaired multi-omics data due to instrument sensitivity and cost. Studies may fail if certain aspects of the subjects are missing or incomplete. In this paper, we propose a deep learning method for multi-omics integration with incomplete data by Cross-omics Linked unified embedding with Contrastive Learning and Self Attention (CLCLSA). Utilizing complete multi-omics data as supervision, the model employs cross-omics autoencoders to learn the feature representation across different types of biological data. The multi-omics contrastive learning, which is used to maximize the mutual information between different types of omics, is employed before latent feature concatenation. In addition, the feature-level self-attention and omics-level self-attention are employed to dynamically identify the most informative features for multi-omics data integration. Extensive experiments were conducted on four public multi-omics datasets. The experimental results indicated that the proposed CLCLSA outperformed the state-of-the-art approaches for multi-omics data classification using incomplete multi-omics data.
CVJan 11, 2023
AGMN: Association Graph-based Graph Matching Network for Coronary Artery Semantic Labeling on Invasive Coronary AngiogramsChen Zhao, Zhihui Xu, Jingfeng Jiang et al.
Semantic labeling of coronary arterial segments in invasive coronary angiography (ICA) is important for automated assessment and report generation of coronary artery stenosis in the computer-aided diagnosis of coronary artery disease (CAD). Inspired by the training procedure of interventional cardiologists for interpreting the structure of coronary arteries, we propose an association graph-based graph matching network (AGMN) for coronary arterial semantic labeling. We first extract the vascular tree from invasive coronary angiography (ICA) and convert it into multiple individual graphs. Then, an association graph is constructed from two individual graphs where each vertex represents the relationship between two arterial segments. Using the association graph, the AGMN extracts the vertex features by the embedding module, aggregates the features from adjacent vertices and edges by graph convolution network, and decodes the features to generate the semantic mappings between arteries. By learning the mapping of arterial branches between two individual graphs, the unlabeled arterial segments are classified by the labeled segments to achieve semantic labeling. A dataset containing 263 ICAs was employed to train and validate the proposed model, and a five-fold cross-validation scheme was performed. Our AGMN model achieved an average accuracy of 0.8264, an average precision of 0.8276, an average recall of 0.8264, and an average F1-score of 0.8262, which significantly outperformed existing coronary artery semantic labeling methods. In conclusion, we have developed and validated a new algorithm with high accuracy, interpretability, and robustness for coronary artery semantic labeling on ICAs.
LGSep 15, 2023
A new method of modeling the multi-stage decision-making process of CRT using machine learning with uncertainty quantificationKristoffer Larsen, Chen Zhao, Joyce Keyak et al.
Aims. The purpose of this study is to create a multi-stage machine learning model to predict cardiac resynchronization therapy (CRT) response for heart failure (HF) patients. This model exploits uncertainty quantification to recommend additional collection of single-photon emission computed tomography myocardial perfusion imaging (SPECT MPI) variables if baseline clinical variables and features from electrocardiogram (ECG) are not sufficient. Methods. 218 patients who underwent rest-gated SPECT MPI were enrolled in this study. CRT response was defined as an increase in left ventricular ejection fraction (LVEF) > 5% at a 6+-1 month follow-up. A multi-stage ML model was created by combining two ensemble models: Ensemble 1 was trained with clinical variables and ECG; Ensemble 2 included Ensemble 1 plus SPECT MPI features. Uncertainty quantification from Ensemble 1 allowed for multi-stage decision-making to determine if the acquisition of SPECT data for a patient is necessary. The performance of the multi-stage model was compared with that of Ensemble models 1 and 2. Results. The response rate for CRT was 55.5% (n = 121) with overall male gender 61.0% (n = 133), an average age of 62.0+-11.8, and LVEF of 27.7+-11.0. The multi-stage model performed similarly to Ensemble 2 (which utilized the additional SPECT data) with AUC of 0.75 vs. 0.77, accuracy of 0.71 vs. 0.69, sensitivity of 0.70 vs. 0.72, and specificity 0.72 vs. 0.65, respectively. However, the multi-stage model only required SPECT MPI data for 52.7% of the patients across all folds. Conclusions. By using rule-based logic stemming from uncertainty quantification, the multi-stage model was able to reduce the need for additional SPECT MPI data acquisition without sacrificing performance.
LGOct 3, 2022
Multi-view information fusion using multi-view variational autoencoders to predict proximal femoral strengthChen Zhao, Joyce H Keyak, Xuewei Cao et al.
The aim of this paper is to design a deep learning-based model to predict proximal femoral strength using multi-view information fusion. Method: We developed new models using multi-view variational autoencoder (MVAE) for feature representation learning and a product of expert (PoE) model for multi-view information fusion. We applied the proposed models to an in-house Louisiana Osteoporosis Study (LOS) cohort with 931 male subjects, including 345 African Americans and 586 Caucasians. With an analytical solution of the product of Gaussian distribution, we adopted variational inference to train the designed MVAE-PoE model to perform common latent feature extraction. We performed genome-wide association studies (GWAS) to select 256 genetic variants with the lowest p-values for each proximal femoral strength and integrated whole genome sequence (WGS) features and DXA-derived imaging features to predict proximal femoral strength. Results: The best prediction model for fall fracture load was acquired by integrating WGS features and DXA-derived imaging features. The designed models achieved the mean absolute percentage error of 18.04%, 6.84% and 7.95% for predicting proximal femoral fracture loads using linear models of fall loading, nonlinear models of fall loading, and nonlinear models of stance loading, respectively. Compared to existing multi-view information fusion methods, the proposed MVAE-PoE achieved the best performance. Conclusion: The proposed models are capable of predicting proximal femoral strength using WGS features and DXA-derived imaging features. Though this tool is not a substitute for FEA using QCT images, it would make improved assessment of hip fracture risk more widely available while avoiding the increased radiation dosage and clinical costs from QCT.
IVJun 7, 2022
A new method incorporating deep learning with shape priors for left ventricular segmentation in myocardial perfusion SPECT imagesFubao Zhu, Jinyu Zhao, Chen Zhao et al.
Background: The assessment of left ventricular (LV) function by myocardial perfusion SPECT (MPS) relies on accurate myocardial segmentation. The purpose of this paper is to develop and validate a new method incorporating deep learning with shape priors to accurately extract the LV myocardium for automatic measurement of LV functional parameters. Methods: A segmentation architecture that integrates a three-dimensional (3D) V-Net with a shape deformation module was developed. Using the shape priors generated by a dynamic programming (DP) algorithm, the model output was then constrained and guided during the model training for quick convergence and improved performance. A stratified 5-fold cross-validation was used to train and validate our models. Results: Results of our proposed method agree well with those from the ground truth. Our proposed model achieved a Dice similarity coefficient (DSC) of 0.9573(0.0244), 0.9821(0.0137), and 0.9903(0.0041), a Hausdorff distances (HD) of 6.7529(2.7334) mm, 7.2507(3.1952) mm, and 7.6121(3.0134) mm in extracting the endocardium, myocardium, and epicardium, respectively. Conclusion: Our proposed method achieved a high accuracy in extracting LV myocardial contours and assessing LV function.
IVJun 24, 2022
Automatic extraction of coronary arteries using deep learning in invasive coronary angiogramsYinghui Meng, Zhenglong Du, Chen Zhao et al.
Accurate extraction of coronary arteries from invasive coronary angiography (ICA) is important in clinical decision-making for the diagnosis and risk stratification of coronary artery disease (CAD). In this study, we develop a method using deep learning to automatically extract the coronary artery lumen. Methods. A deep learning model U-Net 3+, which incorporates the full-scale skip connections and deep supervisions, was proposed for automatic extraction of coronary arteries from ICAs. Transfer learning and a hybrid loss function were employed in this novel coronary artery extraction framework. Results. A data set containing 616 ICAs obtained from 210 patients was used. In the technical evaluation, the U-Net 3+ achieved a Dice score of 0.8942 and a sensitivity of 0.8735, which is higher than U-Net ++ (Dice score: 0.8814, the sensitivity of 0.8331) and U-net (Dice score: 0.8799, the sensitivity of 0.8305). Conclusion. Our study demonstrates that the U-Net 3+ is superior to other segmentation frameworks for the automatic extraction of the coronary arteries from ICAs. This result suggests great promise for clinical use.
MED-PHJul 3, 2023
Automated identification and quantification of myocardial inflammatory infiltration in digital histological images to diagnose myocarditisYanyun Liu, Xiumeng Hua, Shouping Zhu et al.
This study aims to develop a new computational pathology approach that automates the identification and quantification of myocardial inflammatory infiltration in digital HE-stained images to provide a quantitative histological diagnosis of myocarditis.898 HE-stained whole slide images (WSIs) of myocardium from 154 heart transplant patients diagnosed with myocarditis or dilated cardiomyopathy (DCM) were included in this study. An automated DL-based computational pathology approach was developed to identify nuclei and detect myocardial inflammatory infiltration, enabling the quantification of the lymphocyte nuclear density (LND) on myocardial WSIs. A cutoff value based on the quantification of LND was proposed to determine if the myocardial inflammatory infiltration was present. The performance of our approach was evaluated with a five-fold cross-validation experiment, tested with an internal test set from the myocarditis group, and confirmed by an external test from a double-blind trial group. An LND of 1.02/mm2 could distinguish WSIs with myocarditis from those without. The accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (AUC) in the five-fold cross-validation experiment were 0.899 plus or minus 0.035, 0.971 plus or minus 0.017, 0.728 plus or minus 0.073 and 0.849 plus or minus 0.044, respectively. For the internal test set, the accuracy, sensitivity, specificity, and AUC were 0.887, 0.971, 0.737, and 0.854, respectively. The accuracy, sensitivity, specificity, and AUC for the external test set reached 0.853, 0.846, 0.858, and 0.852, respectively. Our new approach provides accurate and reliable quantification of the LND of myocardial WSIs, facilitating automated quantitative diagnosis of myocarditis with HE-stained images.
SPJun 2, 2023
A new method using deep transfer learning on ECG to predict the response to cardiac resynchronization therapyZhuo He, Hongjin Si, Xinwei Zhang et al.
Background: Cardiac resynchronization therapy (CRT) has emerged as an effective treatment for heart failure patients with electrical dyssynchrony. However, accurately predicting which patients will respond to CRT remains a challenge. This study explores the application of deep transfer learning techniques to train a predictive model for CRT response. Methods: In this study, the short-time Fourier transform (STFT) technique was employed to transform ECG signals into two-dimensional images. A transfer learning approach was then applied on the MIT-BIT ECG database to pre-train a convolutional neural network (CNN) model. The model was fine-tuned to extract relevant features from the ECG images, and then tested on our dataset of CRT patients to predict their response. Results: Seventy-one CRT patients were enrolled in this study. The transfer learning model achieved an accuracy of 72% in distinguishing responders from non-responders in the local dataset. Furthermore, the model showed good sensitivity (0.78) and specificity (0.79) in identifying CRT responders. The performance of our model outperformed clinic guidelines and traditional machine learning approaches. Conclusion: The utilization of ECG images as input and leveraging the power of transfer learning allows for improved accuracy in identifying CRT responders. This approach offers potential for enhancing patient selection and improving outcomes of CRT.
GNOct 12, 2023
Multi-View Variational Autoencoder for Missing Value Imputation in Untargeted MetabolomicsChen Zhao, Kuan-Jui Su, Chong Wu et al.
Background: Missing data is a common challenge in mass spectrometry-based metabolomics, which can lead to biased and incomplete analyses. The integration of whole-genome sequencing (WGS) data with metabolomics data has emerged as a promising approach to enhance the accuracy of data imputation in metabolomics studies. Method: In this study, we propose a novel method that leverages the information from WGS data and reference metabolites to impute unknown metabolites. Our approach utilizes a multi-view variational autoencoder to jointly model the burden score, polygenetic risk score (PGS), and linkage disequilibrium (LD) pruned single nucleotide polymorphisms (SNPs) for feature extraction and missing metabolomics data imputation. By learning the latent representations of both omics data, our method can effectively impute missing metabolomics values based on genomic information. Results: We evaluate the performance of our method on empirical metabolomics datasets with missing values and demonstrate its superiority compared to conventional imputation techniques. Using 35 template metabolites derived burden scores, PGS and LD-pruned SNPs, the proposed methods achieved R^2-scores > 0.01 for 71.55% of metabolites. Conclusion: The integration of WGS data in metabolomics imputation not only improves data completeness but also enhances downstream analyses, paving the way for more comprehensive and accurate investigations of metabolic pathways and disease associations. Our findings offer valuable insights into the potential benefits of utilizing WGS data for metabolomics data imputation and underscore the importance of leveraging multi-modal data integration in precision medicine research.
LGApr 6
Vehicle-as-Prompt: A Unified Deep Reinforcement Learning Framework for Heterogeneous Fleet Vehicle Routing ProblemShihong Huang, Shengjie Wang, Lei Gao et al.
Unlike traditional homogeneous routing problems, the Heterogeneous Fleet Vehicle Routing Problem (HFVRP) involves heterogeneous fixed costs, variable travel costs, and capacity constraints, rendering solution quality highly sensitive to vehicle selection. Furthermore, real-world logistics applications often impose additional complex constraints, markedly increasing computational complexity. However, most existing Deep Reinforcement Learning (DRL)-based methods are restricted to homogeneous scenarios, leading to suboptimal performance when applied to HFVRP and its complex variants. To bridge this gap, we investigate HFVRP under complex constraints and develop a unified DRL framework capable of solving the problem across various variant settings. We introduce the Vehicle-as-Prompt (VaP) mechanism, which formulates the problem as a single-stage autoregressive decision process. Building on this, we propose VaP-CSMV, a framework featuring a cross-semantic encoder and a multi-view decoder that effectively addresses various problem variants and captures the complex mapping relationships between vehicle heterogeneity and customer node attributes. Extensive experimental results demonstrate that VaP-CSMV significantly outperforms existing state-of-the-art DRL-based neural solvers and achieves competitive solution quality compared to traditional heuristic solvers, while reducing inference time to mere seconds. Furthermore, the framework exhibits strong zero-shot generalization capabilities on large-scale and previously unseen problem variants, while ablation studies validate the vital contribution of each component.
CVAug 20, 2023
Hyper Association Graph Matching with Uncertainty Quantification for Coronary Artery Semantic LabelingChen Zhao, Michele Esposito, Zhihui Xu et al.
Coronary artery disease (CAD) is one of the primary causes leading to death worldwide. Accurate extraction of individual arterial branches on invasive coronary angiograms (ICA) is important for stenosis detection and CAD diagnosis. However, deep learning-based models face challenges in generating semantic segmentation for coronary arteries due to the morphological similarity among different types of coronary arteries. To address this challenge, we propose an innovative approach using the hyper association graph-matching neural network with uncertainty quantification (HAGMN-UQ) for coronary artery semantic labeling on ICAs. The graph-matching procedure maps the arterial branches between two individual graphs, so that the unlabeled arterial segments are classified by the labeled segments, and the coronary artery semantic labeling is achieved. By incorporating the anatomical structural loss and uncertainty, our model achieved an accuracy of 0.9345 for coronary artery semantic labeling with a fast inference speed, leading to an effective and efficient prediction in real-time clinical decision-making scenarios.
IVJan 29, 2023
Incremental Value and Interpretability of Radiomics Features of Both Lung and Epicardial Adipose Tissue for Detecting the Severity of COVID-19 InfectionNi Yao, Yanhui Tian, Daniel Gama das Neves et al.
Epicardial adipose tissue (EAT) is known for its pro-inflammatory properties and association with Coronavirus Disease 2019 (COVID-19) severity. However, current EAT segmentation methods do not consider positional information. Additionally, the detection of COVID-19 severity lacks consideration for EAT radiomics features, which limits interpretability. This study investigates the use of radiomics features from EAT and lungs to detect the severity of COVID-19 infections. A retrospective analysis of 515 patients with COVID-19 (Cohort1: 415, Cohort2: 100) was conducted using a proposed three-stage deep learning approach for EAT extraction. Lung segmentation was achieved using a published method. A hybrid model for detecting the severity of COVID-19 was built in a derivation cohort, and its performance and uncertainty were evaluated in internal (125, Cohort1) and external (100, Cohort2) validation cohorts. For EAT extraction, the Dice similarity coefficients (DSC) of the two centers were 0.972 (+-0.011) and 0.968 (+-0.005), respectively. For severity detection, the hybrid model with radiomics features of both lungs and EAT showed improvements in AUC, net reclassification improvement (NRI), and integrated discrimination improvement (IDI) compared to the model with only lung radiomics features. The hybrid model exhibited an increase of 0.1 (p<0.001), 19.3%, and 18.0% respectively, in the internal validation cohort and an increase of 0.09 (p<0.001), 18.0%, and 18.0%, respectively, in the external validation cohort while outperforming existing detection methods. Uncertainty quantification and radiomics features analysis confirmed the interpretability of case prediction after inclusion of EAT features.
IVJun 29, 2023
MLA-BIN: Model-level Attention and Batch-instance Style Normalization for Domain Generalization of Federated Learning on Medical Image SegmentationFubao Zhu, Yanhui Tian, Chuang Han et al.
The privacy protection mechanism of federated learning (FL) offers an effective solution for cross-center medical collaboration and data sharing. In multi-site medical image segmentation, each medical site serves as a client of FL, and its data naturally forms a domain. FL supplies the possibility to improve the performance of seen domains model. However, there is a problem of domain generalization (DG) in the actual de-ployment, that is, the performance of the model trained by FL in unseen domains will decrease. Hence, MLA-BIN is proposed to solve the DG of FL in this study. Specifically, the model-level attention module (MLA) and batch-instance style normalization (BIN) block were designed. The MLA represents the unseen domain as a linear combination of seen domain models. The atten-tion mechanism is introduced for the weighting coefficient to obtain the optimal coefficient ac-cording to the similarity of inter-domain data features. MLA enables the global model to gen-eralize to unseen domain. In the BIN block, batch normalization (BN) and instance normalization (IN) are combined to perform the shallow layers of the segmentation network for style normali-zation, solving the influence of inter-domain image style differences on DG. The extensive experimental results of two medical image seg-mentation tasks demonstrate that the proposed MLA-BIN outperforms state-of-the-art methods.
CVOct 30, 2025
PF-DAformer: Proximal Femur Segmentation via Domain Adaptive Transformer for Dual-Center QCTRochak Dhakal, Chen Zhao, Zixin Shi et al.
Quantitative computed tomography (QCT) plays a crucial role in assessing bone strength and fracture risk by enabling volumetric analysis of bone density distribution in the proximal femur. However, deploying automated segmentation models in practice remains difficult because deep networks trained on one dataset often fail when applied to another. This failure stems from domain shift, where scanners, reconstruction settings, and patient demographics vary across institutions, leading to unstable predictions and unreliable quantitative metrics. Overcoming this barrier is essential for multi-center osteoporosis research and for ensuring that radiomics and structural finite element analysis results remain reproducible across sites. In this work, we developed a domain-adaptive transformer segmentation framework tailored for multi-institutional QCT. Our model is trained and validated on one of the largest hip fracture related research cohorts to date, comprising 1,024 QCT images scans from Tulane University and 384 scans from Rochester, Minnesota for proximal femur segmentation. To address domain shift, we integrate two complementary strategies within a 3D TransUNet backbone: adversarial alignment via Gradient Reversal Layer (GRL), which discourages the network from encoding site-specific cues, and statistical alignment via Maximum Mean Discrepancy (MMD), which explicitly reduces distributional mismatches between institutions. This dual mechanism balances invariance and fine-grained alignment, enabling scanner-agnostic feature learning while preserving anatomical detail.
IROct 14, 2024Code
SGUQ: Staged Graph Convolution Neural Network for Alzheimer's Disease Diagnosis using Multi-Omics DataLiang Tao, Yixin Xie, Jeffrey D Deng et al.
Alzheimer's disease (AD) is a chronic neurodegenerative disorder and the leading cause of dementia, significantly impacting cost, mortality, and burden worldwide. The advent of high-throughput omics technologies, such as genomics, transcriptomics, proteomics, and epigenomics, has revolutionized the molecular understanding of AD. Conventional AI approaches typically require the completion of all omics data at the outset to achieve optimal AD diagnosis, which are inefficient and may be unnecessary. To reduce the clinical cost and improve the accuracy of AD diagnosis using multi-omics data, we propose a novel staged graph convolutional network with uncertainty quantification (SGUQ). SGUQ begins with mRNA and progressively incorporates DNA methylation and miRNA data only when necessary, reducing overall costs and exposure to harmful tests. Experimental results indicate that 46.23% of the samples can be reliably predicted using only single-modal omics data (mRNA), while an additional 16.04% of the samples can achieve reliable predictions when combining two omics data types (mRNA + DNA methylation). In addition, the proposed staged SGUQ achieved an accuracy of 0.858 on ROSMAP dataset, which outperformed existing methods significantly. The proposed SGUQ can not only be applied to AD diagnosis using multi-omics data but also has the potential for clinical decision-making using multi-viewed data. Our implementation is publicly available at https://github.com/chenzhao2023/multiomicsuncertainty.
CVMay 6, 2025Code
UPMAD-Net: A Brain Tumor Segmentation Network with Uncertainty Guidance and Adaptive Multimodal Feature FusionZhanyuan Jia, Ni Yao, Danyang Sun et al.
Background: Brain tumor segmentation has a significant impact on the diagnosis and treatment of brain tumors. Accurate brain tumor segmentation remains challenging due to their irregular shapes, vague boundaries, and high variability. Objective: We propose a brain tumor segmentation method that combines deep learning with prior knowledge derived from a region-growing algorithm. Methods: The proposed method utilizes a multi-scale feature fusion (MSFF) module and adaptive attention mechanisms (AAM) to extract multi-scale features and capture global contextual information. To enhance the model's robustness in low-confidence regions, the Monte Carlo Dropout (MC Dropout) strategy is employed for uncertainty estimation. Results: Extensive experiments demonstrate that the proposed method achieves superior performance on Brain Tumor Segmentation (BraTS) datasets, significantly outperforming various state-of-the-art methods. On the BraTS2021 dataset, the test Dice scores are 89.18% for Enhancing Tumor (ET) segmentation, 93.67% for Whole Tumor (WT) segmentation, and 91.23% for Tumor Core (TC) segmentation. On the BraTS2019 validation set, the validation Dice scores are 87.43%, 90.92%, and 90.40% for ET, WT, and TC segmentation, respectively. Ablation studies further confirmed the contribution of each module to segmentation accuracy, indicating that each component played a vital role in overall performance improvement. Conclusion: This study proposed a novel 3D brain tumor segmentation network based on the U-Net architecture. By incorporating the prior knowledge and employing the uncertainty estimation method, the robustness and performance were improved. The code for the proposed method is available at https://github.com/chenzhao2023/UPMAD_Net_BrainSeg.
CVApr 27, 2025Code
Myocardial Region-guided Feature Aggregation Net for Automatic Coronary artery Segmentation and Stenosis Assessment using Coronary Computed Tomography AngiographyNi Yao, Xiangyu Liu, Danyang Sun et al.
Coronary artery disease (CAD) remains a leading cause of mortality worldwide, requiring accurate segmentation and stenosis detection using Coronary Computed Tomography angiography (CCTA). Existing methods struggle with challenges such as low contrast, morphological variability and small vessel segmentation. To address these limitations, we propose the Myocardial Region-guided Feature Aggregation Net, a novel U-shaped dual-encoder architecture that integrates anatomical prior knowledge to enhance robustness in coronary artery segmentation. Our framework incorporates three key innovations: (1) a Myocardial Region-guided Module that directs attention to coronary regions via myocardial contour expansion and multi-scale feature fusion, (2) a Residual Feature Extraction Encoding Module that combines parallel spatial channel attention with residual blocks to enhance local-global feature discrimination, and (3) a Multi-scale Feature Fusion Module for adaptive aggregation of hierarchical vascular features. Additionally, Monte Carlo dropout f quantifies prediction uncertainty, supporting clinical interpretability. For stenosis detection, a morphology-based centerline extraction algorithm separates the vascular tree into anatomical branches, enabling cross-sectional area quantification and stenosis grading. The superiority of MGFA-Net was demonstrated by achieving an Dice score of 85.04%, an accuracy of 84.24%, an HD95 of 6.1294 mm, and an improvement of 5.46% in true positive rate for stenosis detection compared to3D U-Net. The integrated segmentation-to-stenosis pipeline provides automated, clinically interpretable CAD assessment, bridging deep learning with anatomical prior knowledge for precision medicine. Our code is publicly available at http://github.com/chenzhao2023/MGFA_CCTA
CVOct 23, 2024Code
MsMorph: An Unsupervised pyramid learning network for brain image registrationJiaofen Nan, Gaodeng Fan, Kaifan Zhang et al.
In the field of medical image analysis, image registration is a crucial technique. Despite the numerous registration models that have been proposed, existing methods still fall short in terms of accuracy and interpretability. In this paper, we present MsMorph, a deep learning-based image registration framework aimed at mimicking the manual process of registering image pairs to achieve more similar deformations, where the registered image pairs exhibit consistency or similarity in features. By extracting the feature differences between image pairs across various as-pects using gradients, the framework decodes semantic information at different scales and continuously compen-sates for the predicted deformation field, driving the optimization of parameters to significantly improve registration accuracy. The proposed method simulates the manual approach to registration, focusing on different regions of the image pairs and their neighborhoods to predict the deformation field between the two images, which provides strong interpretability. We compared several existing registration methods on two public brain MRI datasets, including LPBA and Mindboggle. The experimental results show that our method consistently outperforms state of the art in terms of metrics such as Dice score, Hausdorff distance, average symmetric surface distance, and non-Jacobian. The source code is publicly available at https://github.com/GaodengFan/MsMorph
IVNov 1, 2023
A Robust Deep Learning Method with Uncertainty Estimation for the Pathological Classification of Renal Cell Carcinoma based on CT ImagesNi Yao, Hang Hu, Kaicong Chen et al.
Objectives To develop and validate a deep learning-based diagnostic model incorporating uncertainty estimation so as to facilitate radiologists in the preoperative differentiation of the pathological subtypes of renal cell carcinoma (RCC) based on CT images. Methods Data from 668 consecutive patients, pathologically proven RCC, were retrospectively collected from Center 1. By using five-fold cross-validation, a deep learning model incorporating uncertainty estimation was developed to classify RCC subtypes into clear cell RCC (ccRCC), papillary RCC (pRCC), and chromophobe RCC (chRCC). An external validation set of 78 patients from Center 2 further evaluated the model's performance. Results In the five-fold cross-validation, the model's area under the receiver operating characteristic curve (AUC) for the classification of ccRCC, pRCC, and chRCC was 0.868 (95% CI: 0.826-0.923), 0.846 (95% CI: 0.812-0.886), and 0.839 (95% CI: 0.802-0.88), respectively. In the external validation set, the AUCs were 0.856 (95% CI: 0.838-0.882), 0.787 (95% CI: 0.757-0.818), and 0.793 (95% CI: 0.758-0.831) for ccRCC, pRCC, and chRCC, respectively. Conclusions The developed deep learning model demonstrated robust performance in predicting the pathological subtypes of RCC, while the incorporated uncertainty emphasized the importance of understanding model confidence, which is crucial for assisting clinical decision-making for patients with renal tumors. Clinical relevance statement Our deep learning approach, integrated with uncertainty estimation, offers clinicians a dual advantage: accurate RCC subtype predictions complemented by diagnostic confidence references, promoting informed decision-making for patients with RCC.
MED-PHFeb 17
Imaging-Derived Coronary Fractional Flow Reserve: Advances in Physics-Based, Machine-Learning, and Physics-Informed MethodsTanxin Zhu, Emran Hossen, Chen Zhao et al.
Purpose of Review Imaging derived fractional flow reserve (FFR) is rapidly evolving beyond conventional computational fluid dynamics (CFD) based pipelines toward machine learning (ML), deep learning (DL), and physics informed approaches that enable fast, wire free, and scalable functional assessment of coronary stenosis. This review synthesizes recent advances in CT and angiography based FFR, with particular emphasis on emerging physics informed neural networks and neural operators (PINNs and PINOs) and key considerations for their clinical translation. Recent Findings ML/DL approaches have markedly improved automation and computational speed, enabling prediction of pressure and FFR from anatomical descriptors or angiographic contrast dynamics. However, their real-world performance and generalizability can remain variable and sensitive to domain shift, due to multi-center heterogeneity, interpretability challenges, and differences in acquisition protocols and image quality. Physics informed learning introduces conservation structure and boundary condition consistency into model training, improving generalizability and reducing dependence on dense supervision while maintaining rapid inference. Recent evaluation trends increasingly highlight deployment oriented metrics, including calibration, uncertainty quantification, and quality control gatekeeping, as essential for safe clinical use. Summary The field is converging toward imaging derived FFR methods that are faster, more automated, and more reliable. While ML/DL offers substantial efficiency gains, physics informed frameworks such as PINNs and PINOs may provide a more robust balance between speed and physical consistency. Prospective multi center validation and standardized evaluation will be critical to support broad and safe clinical adoption.
CVApr 27
Point Cloud Registration for Fusion between SPECT MPI and CTA ImagesNi Yao, Xiangyu Liu, Shaojie Tang et al.
Clinical fusion of Single Photon Emission Computed Tomography Myocardial Perfusion Imaging (SPECT MPI) and Computed Tomography Angiography (CTA) remains limited by cross-modality misregistration and reliance on manual landmarks, which can hinder accurate ischemia localization and lesion-level functional assessment. To address this issue, we propose a registration and fusion framework for SPECT MPI and CTA that integrates functional and structural information for comprehensive cardiac evaluation. The proposed pipeline performs U-Net-based segmentation on both modalities. On SPECT MPI, only the left ventricle (LV) is extracted, and anatomical landmarks are automatically derived from characteristic LV structures. On CTA, both ventricles are segmented, and their spatial relationship is used to automatically define landmarks at the interventricular septal junction. Scale-space consistency preprocessing and landmark-driven coarse registration are applied to mitigate initial misalignment. Based on this initialization, multiple fine registration methods are evaluated on LV epicardial surface point clouds, including ICP, SICP, CPD, CluReg, FFD, and BCPD-plus-plus. The resulting transformations are then propagated to voxel-level resampling for high-precision SPECT-CTA fusion. In a retrospective cohort of 60 patients, the proposed framework preserved sub-millimeter coronary detail from CTA while accurately overlaying quantitative SPECT perfusion. Among the evaluated methods, BCPD-plus-plus achieved the highest accuracy with a mean point cloud distance of 1.7 mm. By combining robust initialization, comparative fine registration, and voxel-level fusion, the proposed approach provides a practical solution for myocardial ischemia localization and functional evaluation of coronary lesions, while remaining independent of any specific fine registration algorithm.
AIMar 19
Accurate and Efficient Multi-Channel Time Series Forecasting via Sparse Attention MechanismLei Gao, Hengda Bao, Jingfei Fang et al.
The task of multi-channel time series forecasting is ubiquitous in numerous fields such as finance, supply chain management, and energy planning. It is critical to effectively capture complex dynamic dependencies within and between channels for accurate predictions. However, traditional method paid few attentions on learning the interaction among channels. This paper proposes Linear-Network (Li-Net), a novel architecture designed for multi-channel time series forecasting that captures the linear and non-linear dependencies among channels. Li-Net dynamically compresses representations across sequence and channel dimensions, processes the information through a configurable non-linear module and subsequently reconstructs the forecasts. Moreover, Li-Net integrates a sparse Top-K Softmax attention mechanism within a multi-scale projection framework to address these challenges. A core innovation is its ability to seamlessly incorporate and fuse multi-modal embeddings, guiding the sparse attention process to focus on the most informative time steps and feature channels. Through the experiment results on multiple real-world benchmark datasets demonstrate that Li-Net achieves competitive performance compared to state-of-the-art baseline methods. Furthermore, Li-Net provides a superior balance between prediction accuracy and computational burden, exhibiting significantly lower memory usage and faster inference times. Detailed ablation studies and parameter sensitivity analyses validate the effectiveness of each key component in our proposed architecture. Keywords: Multivariate Time Series Forecasting, Sparse Attention Mechanism, Multimodal Information Fusion, Non-linear relationship
CVApr 21, 2025
ICGM-FRAX: Iterative Cross Graph Matching for Hip Fracture Risk Assessment using Dual-energy X-ray Absorptiometry ImagesChen Zhao, Anjum Shaik, Joyce H. Keyak et al.
Hip fractures represent a major health concern, particularly among the elderly, often leading decreased mobility and increased mortality. Early and accurate detection of at risk individuals is crucial for effective intervention. In this study, we propose Iterative Cross Graph Matching for Hip Fracture Risk Assessment (ICGM-FRAX), a novel approach for predicting hip fractures using Dual-energy X-ray Absorptiometry (DXA) images. ICGM-FRAX involves iteratively comparing a test (subject) graph with multiple template graphs representing the characteristics of hip fracture subjects to assess the similarity and accurately to predict hip fracture risk. These graphs are obtained as follows. The DXA images are separated into multiple regions of interest (RoIs), such as the femoral head, shaft, and lesser trochanter. Radiomic features are then calculated for each RoI, with the central coordinates used as nodes in a graph. The connectivity between nodes is established according to the Euclidean distance between these coordinates. This process transforms each DXA image into a graph, where each node represents a RoI, and edges derived by the centroids of RoIs capture the spatial relationships between them. If the test graph closely matches a set of template graphs representing subjects with incident hip fractures, it is classified as indicating high hip fracture risk. We evaluated our method using 547 subjects from the UK Biobank dataset, and experimental results show that ICGM-FRAX achieved a sensitivity of 0.9869, demonstrating high accuracy in predicting hip fractures.
CVFeb 24, 2024
Multi-graph Graph Matching for Coronary Artery Semantic LabelingChen Zhao, Zhihui Xu, Pukar Baral et al.
Coronary artery disease (CAD) stands as the leading cause of death worldwide, and invasive coronary angiography (ICA) remains the gold standard for assessing vascular anatomical information. However, deep learning-based methods encounter challenges in generating semantic labels for arterial segments, primarily due to the morphological similarity between arterial branches and varying anatomy of arterial system between different projection view angles and patients. To address this challenge, we model the vascular tree as a graph and propose a multi-graph graph matching (MGM) algorithm for coronary artery semantic labeling. The MGM algorithm assesses the similarity between arterials in multiple vascular tree graphs, considering the cycle consistency between each pair of graphs. As a result, the unannotated arterial segments are appropriately labeled by matching them with annotated segments. Through the incorporation of anatomical graph structure, radiomics features, and semantic mapping, the proposed MGM model achieves an impressive accuracy of 0.9471 for coronary artery semantic labeling using our multi-site dataset with 718 ICAs. With the semantic labeled arteries, an overall accuracy of 0.9155 was achieved for stenosis detection. The proposed MGM presents a novel tool for coronary artery analysis using multiple ICA-derived graphs, offering valuable insights into vascular health and pathology.
IVFeb 4, 2024
3D Lymphoma Segmentation on PET/CT Images via Multi-Scale Information Fusion with Cross-AttentionHuan Huang, Liheng Qiu, Shenmiao Yang et al.
Background: Accurate segmentation of diffuse large B-cell lymphoma (DLBCL) lesions is challenging due to their complex patterns in medical imaging. Objective: This study aims to develop a precise segmentation method for DLBCL using 18F-Fluorodeoxyglucose (FDG) positron emission tomography (PET) and computed tomography (CT) images. Methods: We propose a 3D dual-branch encoder segmentation method using shifted window transformers and a Multi-Scale Information Fusion (MSIF) module. To enhance feature integration, the MSIF module performs multi-scale feature fusion using cross-attention mechanisms with a shifted window framework. A gated neural network within the MSIF module dynamically balances the contributions from each modality. The model was optimized using the Dice Similarity Coefficient (DSC) loss function. Additionally, we computed the total metabolic tumor volume (TMTV) and performed statistical analyses. Results: The model was trained and validated on a dataset of 165 DLBCL patients using 5-fold cross-validation, achieving a DSC of 0.7512. Statistical analysis showed a significant improvement over comparative methods (p < 0.05). Additionally, a Pearson correlation coefficient of 0.91 and an R^2 of 0.89 were observed when comparing manual annotations to segmentation results for TMTV measurement. Conclusion: This study presents an effective automatic segmentation method for DLBCL that leverages the complementary strengths of PET and CT imaging. Our method has the potential to improve diagnostic interpretations and assist in treatment planning for DLBCL patients.
LGFeb 20
Improving Generalizability of Hip Fracture Risk Prediction via Domain Adaptation Across Multiple CohortsShuo Sun, Meiling Zhou, Chen Zhao et al.
Clinical risk prediction models often fail to be generalized across cohorts because underlying data distributions differ by clinical site, region, demographics, and measurement protocols. This limitation is particularly pronounced in hip fracture risk prediction, where the performance of models trained on one cohort (the source cohort) can degrade substantially when deployed in other cohorts (target cohorts). We used a shared set of clinical and DXA-derived features across three large cohorts - the Study of Osteoporotic Fractures (SOF), the Osteoporotic Fractures in Men Study (MrOS), and the UK Biobank (UKB), to systematically evaluate the performance of three domain adaptation methods - Maximum Mean Discrepancy (MMD), Correlation Alignment (CORAL), and Domain - Adversarial Neural Networks (DANN) and their combinations. For a source cohort with males only and a source cohort with females only, domain-adaptation methods consistently showed improved performance than the no-adaptation baseline (source-only training), and the use of combinations of multiple domain adaptation methods delivered the largest and most stable gains. The method that combines MMD, CORAL, and DANN achieved the highest discrimination with the area under curve (AUC) of 0.88 for a source cohort with males only and 0.95 for a source cohort with females only), demonstrating that integrating multiple domain adaptation methods could produce feature representations that are less sensitive to dataset differences. Unlike existing methods that rely heavily on supervised tuning or assume known outcomes of samples in target cohorts, our outcome-free approaches enable the model selection under realistic deployment conditions and improve generalization of models in hip fracture risk prediction.
CVDec 5, 2025
UG-FedDA: Uncertainty-Guided Federated Domain Adaptation for Multi-Center Alzheimer's Disease DetectionFubao Zhu, Zhanyuan Jia, Zhiguo Wang et al.
Alzheimer's disease (AD) is an irreversible neurodegenerative disorder, and early diagnosis is critical for timely intervention. However, most existing classification frameworks face challenges in multicenter studies, as they often neglect inter-site heterogeneity and lack mechanisms to quantify uncertainty, which limits their robustness and clinical applicability. To address these issues, we proposed Uncertainty-Guided Federated Domain Adaptation (UG-FedDA), a novel multicenter AD classification framework that integrates uncertainty quantification (UQ) with federated domain adaptation to handle cross-site structure magnetic resonance imaging (MRI) heterogeneity under privacy constraints. Our approach extracts multi-template region-of-interest (RoI) features using a self-attention transformer, capturing both regional representations and their interactions. UQ is integrated to guide feature alignment, mitigating source-target distribution shifts by down-weighting uncertain samples. Experiments are conducted on three public datasets: the Alzheimer's Disease Neuroimaging Initiative (ADNI), the Australian Imaging, Biomarkers and Lifestyle study (AIBL), and the Open Access Series of Imaging Studies (OASIS). UG-FedDA achieved consistent cross-domain improvements in accuracy, sensitivity, and area under the ROC curve across three classification tasks: AD vs. normal controls (NC), mild cognitive impairment (MCI) vs. AD, and NC vs. MCI. For NC vs. AD, UG-FedDA achieves accuracies of 90.54%, 89.04%, and 77.78% on ADNI, AIBL and OASIS datasets, respectively. For MCI vs. AD, accuracies are 80.20% (ADNI), 71.91% (AIBL), and 79.73% (OASIS). For NC vs. MCI, results are 76.87% (ADNI), 73.91% (AIBL), and 83.73% (OASIS). These results demonstrate that the proposed framework not only adapts efficiently across multiple sites but also preserves strict privacy.
LGOct 16, 2025
An Advanced Two-Stage Model with High Sensitivity and Generalizability for Prediction of Hip Fracture Risk Using Multiple DatasetsShuo Sun, Meiling Zhou, Chen Zhao et al.
Hip fractures are a major cause of disability, mortality, and healthcare burden in older adults, underscoring the need for early risk assessment. However, commonly used tools such as the DXA T-score and FRAX often lack sensitivity and miss individuals at high risk, particularly those without prior fractures or with osteopenia. To address this limitation, we propose a sequential two-stage model that integrates clinical and imaging information to improve prediction accuracy. Using data from the Osteoporotic Fractures in Men Study (MrOS), the Study of Osteoporotic Fractures (SOF), and the UK Biobank, Stage 1 (Screening) employs clinical, demographic, and functional variables to estimate baseline risk, while Stage 2 (Imaging) incorporates DXA-derived features for refinement. The model was rigorously validated through internal and external testing, showing consistent performance and adaptability across cohorts. Compared to T-score and FRAX, the two-stage framework achieved higher sensitivity and reduced missed cases, offering a cost-effective and personalized approach for early hip fracture risk assessment. Keywords: Hip Fracture, Two-Stage Model, Risk Prediction, Sensitivity, DXA, FRAX
LGOct 11, 2025
One4Many-StablePacker: An Efficient Deep Reinforcement Learning Framework for the 3D Bin Packing ProblemLei Gao, Shihong Huang, Shengjie Wang et al.
The three-dimensional bin packing problem (3D-BPP) is widely applied in logistics and warehousing. Existing learning-based approaches often neglect practical stability-related constraints and exhibit limitations in generalizing across diverse bin dimensions. To address these limitations, we propose a novel deep reinforcement learning framework, One4Many-StablePacker (O4M-SP). The primary advantage of O4M-SP is its ability to handle various bin dimensions in a single training process while incorporating support and weight constraints common in practice. Our training method introduces two innovative mechanisms. First, it employs a weighted reward function that integrates loading rate and a new height difference metric for packing layouts, promoting improved bin utilization through flatter packing configurations. Second, it combines clipped policy gradient optimization with a tailored policy drifting method to mitigate policy entropy collapse, encouraging exploration at critical decision nodes during packing to avoid suboptimal solutions. Extensive experiments demonstrate that O4M-SP generalizes successfully across diverse bin dimensions and significantly outperforms baseline methods. Furthermore, O4M-SP exhibits strong practical applicability by effectively addressing packing scenarios with stability constraints.
IVMar 28, 2025
Diagnosis of Pulmonary Hypertension by Integrating Multimodal Data with a Hybrid Graph Convolutional and Transformer NetworkFubao Zhu, Yang Zhang, Gengmin Liang et al.
Early and accurate diagnosis of pulmonary hypertension (PH) is essential for optimal patient management. Differentiating between pre-capillary and post-capillary PH is critical for guiding treatment decisions. This study develops and validates a deep learning-based diagnostic model for PH, designed to classify patients as non-PH, pre-capillary PH, or post-capillary PH. This retrospective study analyzed data from 204 patients (112 with pre-capillary PH, 32 with post-capillary PH, and 60 non-PH controls) at the First Affiliated Hospital of Nanjing Medical University. Diagnoses were confirmed through right heart catheterization. We selected 6 samples from each category for the test set (18 samples, 10%), with the remaining 186 samples used for the training set. This process was repeated 35 times for testing. This paper proposes a deep learning model that combines Graph convolutional networks (GCN), Convolutional neural networks (CNN), and Transformers. The model was developed to process multimodal data, including short-axis (SAX) sequences, four-chamber (4CH) sequences, and clinical parameters. Our model achieved a performance of Area under the receiver operating characteristic curve (AUC) = 0.81 +- 0.06(standard deviation) and Accuracy (ACC) = 0.73 +- 0.06 on the test set. The discriminative abilities were as follows: non-PH subjects (AUC = 0.74 +- 0.11), pre-capillary PH (AUC = 0.86 +- 0.06), and post-capillary PH (AUC = 0.83 +- 0.10). It has the potential to support clinical decision-making by effectively integrating multimodal data to assist physicians in making accurate and timely diagnoses.
IVFeb 23, 2025
FedDA-TSformer: Federated Domain Adaptation with Vision TimeSformer for Left Ventricle Segmentation on Gated Myocardial Perfusion SPECT ImageYehong Huang, Chen Zhao, Rochak Dhakal et al.
Background and Purpose: Functional assessment of the left ventricle using gated myocardial perfusion (MPS) single-photon emission computed tomography relies on the precise extraction of the left ventricular contours while simultaneously ensuring the security of patient data. Methods: In this paper, we introduce the integration of Federated Domain Adaptation with TimeSformer, named 'FedDA-TSformer' for left ventricle segmentation using MPS. FedDA-TSformer captures spatial and temporal features in gated MPS images, leveraging spatial attention, temporal attention, and federated learning for improved domain adaptation while ensuring patient data security. In detail, we employed Divide-Space-Time-Attention mechanism to extract spatio-temporal correlations from the multi-centered MPS datasets, ensuring that predictions are spatio-temporally consistent. To achieve domain adaptation, we align the model output on MPS from three different centers using local maximum mean discrepancy (LMMD) loss. This approach effectively addresses the dual requirements of federated learning and domain adaptation, enhancing the model's performance during training with multi-site datasets while ensuring the protection of data from different hospitals. Results: Our FedDA-TSformer was trained and evaluated using MPS datasets collected from three hospitals, comprising a total of 150 subjects. Each subject's cardiac cycle was divided into eight gates. The model achieved Dice Similarity Coefficients (DSC) of 0.842 and 0.907 for left ventricular (LV) endocardium and epicardium segmentation, respectively. Conclusion: Our proposed FedDA-TSformer model addresses the challenge of multi-center generalization, ensures patient data privacy protection, and demonstrates effectiveness in left ventricular (LV) segmentation.
IVFeb 10, 2024
Point cloud-based registration and image fusion between cardiac SPECT MPI and CTAShaojie Tang, Penpen Miao, Xingyu Gao et al.
A method was proposed for the point cloud-based registration and image fusion between cardiac single photon emission computed tomography (SPECT) myocardial perfusion images (MPI) and cardiac computed tomography angiograms (CTA). Firstly, the left ventricle (LV) epicardial regions (LVERs) in SPECT and CTA images were segmented by using different U-Net neural networks trained to generate the point clouds of the LV epicardial contours (LVECs). Secondly, according to the characteristics of cardiac anatomy, the special points of anterior and posterior interventricular grooves (APIGs) were manually marked in both SPECT and CTA image volumes. Thirdly, we developed an in-house program for coarsely registering the special points of APIGs to ensure a correct cardiac orientation alignment between SPECT and CTA images. Fourthly, we employed ICP, SICP or CPD algorithm to achieve a fine registration for the point clouds (together with the special points of APIGs) of the LV epicardial surfaces (LVERs) in SPECT and CTA images. Finally, the image fusion between SPECT and CTA was realized after the fine registration. The experimental results showed that the cardiac orientation was aligned well and the mean distance error of the optimal registration method (CPD with affine transform) was consistently less than 3 mm. The proposed method could effectively fuse the structures from cardiac CTA and SPECT functional images, and demonstrated a potential in assisting in accurate diagnosis of cardiac diseases by combining complementary advantages of the two imaging modalities.
CVMay 21, 2023
Coronary Artery Semantic Labeling using Edge Attention Graph Matching NetworkChen Zhao, Zhihui Xu, Guang-Uei Hung et al.
Coronary artery disease (CAD) is one of the primary causes leading deaths worldwide. The presence of atherosclerotic lesions in coronary arteries is the underlying pathophysiological basis of CAD, and accurate extraction of individual arterial branches using invasive coronary angiography (ICA) is crucial for stenosis detection and CAD diagnosis. We propose an innovative approach called the Edge Attention Graph Matching Network (EAGMN) for coronary artery semantic labeling. By converting the coronary artery semantic segmentation task into a graph node similarity comparison task, identifying the node-to-node correspondence would assign semantic labels for each arterial branch. More specifically, The EAGMN utilizes the association graph constructed from the two individual graphs as input. Experimental results indicate the EAGMN achieved a weighted accuracy of 0.8653, a weighted precision of 0.8656, a weighted recall of 0.8653 and a weighted F1-score of 0.8643. Furthermore, we employ ZORRO to provide interpretability and explainability of the graph matching for artery semantic labeling. These findings highlight the potential of the EAGMN for accurate and efficient coronary artery semantic labeling using ICAs. By leveraging the inherent characteristics of ICAs and incorporating graph matching techniques, our proposed model provides a promising solution for improving CAD diagnosis and treatment
CVMay 4, 2023
A new method using deep learning to predict the response to cardiac resynchronization therapyKristoffer Larsena, Zhuo He, Chen Zhao et al.
Background. Clinical parameters measured from gated single-photon emission computed tomography myocardial perfusion imaging (SPECT MPI) have value in predicting cardiac resynchronization therapy (CRT) patient outcomes, but still show limitations. The purpose of this study is to combine clinical variables, features from electrocardiogram (ECG), and parameters from assessment of cardiac function with polarmaps from gated SPECT MPI through deep learning (DL) to predict CRT response. Methods. 218 patients who underwent rest gated SPECT MPI were enrolled in this study. CRT response was defined as an increase in left ventricular ejection fraction (LVEF) > 5% at a 6-month follow up. A DL model was constructed by combining a pre-trained VGG16 module and a multilayer perceptron. Two modalities of data were input to the model: polarmap images from SPECT MPI and tabular data from clinical features and ECG parameters. Gradient-weighted Class Activation Mapping (Grad-CAM) was applied to the VGG16 module to provide explainability for the polarmaps. For comparison, four machine learning (ML) models were trained using only the tabular features. Results. Modeling was performed on 218 patients who underwent CRT implantation with a response rate of 55.5% (n = 121). The DL model demonstrated average AUC (0.83), accuracy (0.73), sensitivity (0.76), and specificity (0.69) surpassing the ML models and guideline criteria. Guideline recommendations presented accuracy (0.53), sensitivity (0.75), and specificity (0.26). Conclusions. The DL model outperformed the ML models, showcasing the additional predictive benefit of utilizing SPECT MPI polarmaps. Incorporating additional patient data directly in the form of medical imagery can improve CRT response prediction.
CVFeb 21, 2022
A Novel Architecture Slimming Method for Network Pruning and Knowledge DistillationDongqi Wang, Shengyu Zhang, Zhipeng Di et al.
Network pruning and knowledge distillation are two widely-known model compression methods that efficiently reduce computation cost and model size. A common problem in both pruning and distillation is to determine compressed architecture, i.e., the exact number of filters per layer and layer configuration, in order to preserve most of the original model capacity. In spite of the great advances in existing works, the determination of an excellent architecture still requires human interference or tremendous experimentations. In this paper, we propose an architecture slimming method that automates the layer configuration process. We start from the perspective that the capacity of the over-parameterized model can be largely preserved by finding the minimum number of filters preserving the maximum parameter variance per layer, resulting in a thin architecture. We formulate the determination of compressed architecture as a one-step orthogonal linear transformation, and integrate principle component analysis (PCA), where the variances of filters in the first several projections are maximized. We demonstrate the rationality of our analysis and the effectiveness of the proposed method through extensive experiments. In particular, we show that under the same overall compression rate, the compressed architecture determined by our method shows significant performance gain over baselines after pruning and distillation. Surprisingly, we find that the resulting layer-wise compression rates correspond to the layer sensitivities found by existing works through tremendous experimentations.
IVOct 11, 2021
Spatial-temporal V-Net for automatic segmentation and quantification of right ventricles in gated myocardial perfusion SPECT imagesChen Zhao, Shi Shi, Zhuo He et al.
Background. Functional assessment of right ventricle (RV) using gated myocardial perfusion single-photon emission computed tomography (MPS) heavily relies on the precise extraction of right ventricular contours. In this paper, we present a new deep-learning-based model integrating both the spatial and temporal features in gated MPS images to perform the segmentation of the RV epicardium and endocardium. Methods. By integrating the spatial features from each cardiac frame of the gated MPS and the temporal features from the sequential cardiac frames of the gated MPS, we developed a Spatial-Temporal V-Net (ST-VNet) for automatic extraction of RV endocardial and epicardial contours. In the ST-VNet, a V-Net is employed to hierarchically extract spatial features, and convolutional long-term short-term memory (ConvLSTM) units are added to the skip-connection pathway to extract the temporal features. The input of the ST-VNet is ECG-gated sequential frames of the MPS images and the output is the probability map of the epicardial or endocardial masks. A Dice similarity coefficient (DSC) loss which penalizes the discrepancy between the model prediction and the ground truth was adopted to optimize the segmentation model. Results. Our segmentation model was trained and validated on a retrospective dataset with 45 subjects, and the cardiac cycle of each subject was divided into 8 gates. The proposed ST-VNet achieved a DSC of 0.8914 and 0.8157 for the RV epicardium and endocardium segmentation, respectively. The mean absolute error, the mean squared error, and the Pearson correlation coefficient of the RV ejection fraction (RVEF) between the ground truth and the model prediction were 0.0609, 0.0830, and 0.6985. Conclusion. Our proposed ST-VNet is an effective model for RV segmentation. It has great promise for clinical use in RV functional assessment.
IVOct 6, 2021
Automatic Identification of the End-Diastolic and End-Systolic Cardiac Frames from Invasive Coronary Angiography VideosYinghui Meng, Minghao Dong, Xumin Dai et al.
Automatic identification of proper image frames at the end-diastolic (ED) and end-systolic (ES) frames during the review of invasive coronary angiograms (ICA) is important to assess blood flow during a cardiac cycle, reconstruct the 3D arterial anatomy from bi-planar views, and generate the complementary fusion map with myocardial images. The current identification method primarily relies on visual interpretation, making it not only time-consuming but also less reproducible. In this paper, we propose a new method to automatically identify angiographic image frames associated with the ED and ES cardiac phases by using the trajectories of key vessel points (i.e. landmarks). More specifically, a detection algorithm is first used to detect the key points of coronary arteries, and then an optical flow method is employed to track the trajectories of the selected key points. The ED and ES frames are identified based on all these trajectories. Our method was tested with 62 ICA videos from two separate medical centers (22 and 9 patients in sites 1 and 2, respectively). Comparing consensus interpretations by two human expert readers, excellent agreement was achieved by the proposed algorithm: the agreement rates within a one-frame range were 92.99% and 92.73% for the automatic identification of the ED and ES image frames, respectively. In conclusion, the proposed automated method showed great potential for being an integral part of automated ICA image analysis.
MED-PHJun 1, 2021
A method using deep learning to discover new predictors of CRT response from mechanical dyssynchrony on gated SPECT MPIZhuo He, Xinwei Zhang, Chen Zhao et al.
Background. Studies have shown that the conventional left ventricular mechanical dyssynchrony (LVMD) parameters have their own statistical limitations. The purpose of this study is to extract new LVMD parameters from the phase analysis of gated SPECT MPI by deep learning to help CRT patient selection. Methods. One hundred and three patients who underwent rest gated SPECT MPI were enrolled in this study. CRT response was defined as a decrease in left ventricular end-systolic volume (LVESV) >= 15% at 6 +- 1 month follow up. Autoencoder (AE), an unsupervised deep learning method, was trained by the raw LV systolic phase polar maps to extract new LVMD parameters, called AE-based LVMD parameters. Correlation analysis was used to explain the relationships between new parameters with conventional LVMD parameters. Univariate and multivariate analyses were used to establish a multivariate model for predicting CRT response. Results. Complete data were obtained in 102 patients, 44.1% of them were classified as CRT responders. AE-based LVMD parameter was significant in the univariate (OR 1.24, 95% CI 1.07 - 1.44, P = 0.006) and multivariate analyses (OR 1.03, 95% CI 1.01 - 1.06, P = 0.006). Moreover, it had incremental value over PSD (AUC 0.72 vs. 0.63, LH 8.06, P = 0.005) and PBW (AUC 0.72 vs. 0.64, LH 7.87, P = 0.005), combined with significant clinic characteristics, including LVEF and gender. Conclusions. The new LVMD parameters extracted by autoencoder from the baseline gated SPECT MPI has the potential to improve the prediction of CRT response.
CVFeb 21, 2021
A Deep Learning-based Method to Extract Lumen and Media-Adventitia in Intravascular Ultrasound ImagesFubao Zhu, Zhengyuan Gao, Chen Zhao et al.
Intravascular ultrasound (IVUS) imaging allows direct visualization of the coronary vessel wall and is suitable for the assessment of atherosclerosis and the degree of stenosis. Accurate segmentation and measurements of lumen and median-adventitia (MA) from IVUS are essential for such a successful clinical evaluation. However, current segmentation relies on manual operations, which is time-consuming and user-dependent. In this paper, we aim to develop a deep learning-based method using an encoder-decoder deep architecture to automatically extract both lumen and MA border. Our method named IVUS-U-Net++ is an extension of the well-known U-Net++ model. More specifically, a feature pyramid network was added to the U-Net++ model, enabling the utilization of feature maps at different scales. As a result, the accuracy of the probability map and subsequent segmentation have been improved We collected 1746 IVUS images from 18 patients in this study. The whole dataset was split into a training dataset (1572 images) for the 10-fold cross-validation and a test dataset (174 images) for evaluating the performance of models. Our IVUS-U-Net++ segmentation model achieved a Jaccard measure (JM) of 0.9412, a Hausdorff distance (HD) of 0.0639 mm for the lumen border, and a JM of 0.9509, an HD of 0.0867 mm for the MA border, respectively. Moreover, the Pearson correlation and Bland-Altman analyses were performed to evaluate the correlations of 12 clinical parameters measured from our segmentation results and the ground truth, and automatic measurements agreed well with those from the ground truth (all Ps<0.01). In conclusion, our preliminary results demonstrate that the proposed IVUS-U-Net++ model has great promise for clinical use.
CVFeb 3, 2021
A Deep Learning-Based Approach to Extracting Periosteal and Endosteal Contours of Proximal Femur in Quantitative CT ImagesYu Deng, Ling Wang, Chen Zhao et al.
Automatic CT segmentation of proximal femur is crucial for the diagnosis and risk stratification of orthopedic diseases; however, current methods for the femur CT segmentation mainly rely on manual interactive segmentation, which is time-consuming and has limitations in both accuracy and reproducibility. In this study, we proposed an approach based on deep learning for the automatic extraction of the periosteal and endosteal contours of proximal femur in order to differentiate cortical and trabecular bone compartments. A three-dimensional (3D) end-to-end fully convolutional neural network, which can better combine the information between neighbor slices and get more accurate segmentation results, was developed for our segmentation task. 100 subjects aged from 50 to 87 years with 24,399 slices of proximal femur CT images were enrolled in this study. The separation of cortical and trabecular bone derived from the QCT software MIAF-Femur was used as the segmentation reference. We randomly divided the whole dataset into a training set with 85 subjects for 10-fold cross-validation and a test set with 15 subjects for evaluating the performance of models. Two models with the same network structures were trained and they achieved a dice similarity coefficient (DSC) of 97.87% and 96.49% for the periosteal and endosteal contours, respectively. To verify the excellent performance of our model for femoral segmentation, we measured the volume of different parts of the femur and compared it with the ground truth and the relative errors between predicted result and ground truth are all less than 5%. It demonstrated a strong potential for clinical use, including the hip fracture risk prediction and finite element analysis.
IVJan 25, 2021
A new approach to extracting coronary arteries and detecting stenosis in invasive coronary angiogramsChen Zhao, Haipeng Tang, Daniel McGonigle et al.
In stable coronary artery disease (CAD), reduction in mortality and/or myocardial infarction with revascularization over medical therapy has not been reliably achieved. Coronary arteries are usually extracted to perform stenosis detection. We aim to develop an automatic algorithm by deep learning to extract coronary arteries from ICAs.In this study, a multi-input and multi-scale (MIMS) U-Net with a two-stage recurrent training strategy was proposed for the automatic vessel segmentation. Incorporating features such as the Inception residual module with depth-wise separable convolutional layers, the proposed model generated a refined prediction map with the following two training stages: (i) Stage I coarsely segmented the major coronary arteries from pre-processed single-channel ICAs and generated the probability map of vessels; (ii) during the Stage II, a three-channel image consisting of the original preprocessed image, a generated probability map, and an edge-enhanced image generated from the preprocessed image was fed to the proposed MIMS U-Net to produce the final segmentation probability map. During the training stage, the probability maps were iteratively and recurrently updated by feeding into the neural network. After segmentation, an arterial stenosis detection algorithm was developed to extract vascular centerlines and calculate arterial diameters to evaluate stenotic level. Experimental results demonstrated that the proposed method achieved an average Dice score of 0.8329, an average sensitivity of 0.8281, and an average specificity of 0.9979 in our dataset with 294 ICAs obtained from 73 patient. Moreover, our stenosis detection algorithm achieved a true positive rate of 0.6668 and a positive predictive value of 0.7043.
MED-PHJun 9, 2020
A Deep Learning-Based Method for Automatic Segmentation of Proximal Femur from Quantitative Computed Tomography ImagesChen Zhao, Joyce H. Keyak, Jinshan Tang et al.
Purpose: Proximal femur image analyses based on quantitative computed tomography (QCT) provide a method to quantify the bone density and evaluate osteoporosis and risk of fracture. We aim to develop a deep-learning-based method for automatic proximal femur segmentation. Methods and Materials: We developed a 3D image segmentation method based on V-Net, an end-to-end fully convolutional neural network (CNN), to extract the proximal femur QCT images automatically. The proposed V-net methodology adopts a compound loss function, which includes a Dice loss and a L2 regularizer. We performed experiments to evaluate the effectiveness of the proposed segmentation method. In the experiments, a QCT dataset which included 397 QCT subjects was used. For the QCT image of each subject, the ground truth for the proximal femur was delineated by a well-trained scientist. During the experiments for the entire cohort then for male and female subjects separately, 90% of the subjects were used in 10-fold cross-validation for training and internal validation, and to select the optimal parameters of the proposed models; the rest of the subjects were used to evaluate the performance of models. Results: Visual comparison demonstrated high agreement between the model prediction and ground truth contours of the proximal femur portion of the QCT images. In the entire cohort, the proposed model achieved a Dice score of 0.9815, a sensitivity of 0.9852 and a specificity of 0.9992. In addition, an R2 score of 0.9956 (p<0.001) was obtained when comparing the volumes measured by our model prediction with the ground truth. Conclusion: This method shows a great promise for clinical application to QCT and QCT-based finite element analysis of the proximal femur for evaluating osteoporosis and hip fracture risk.