Vicente Grau

IV
h-index: 42
30 papers
819 citations
Novelty: 44%
AI Score: 54

30 Papers

IV · Apr 28, 2023 · Code
Segment Anything Model for Medical Images?

Yuhao Huang, Xin Yang, Lian Liu et al.

The Segment Anything Model (SAM) is the first foundation model for general image segmentation. It has achieved impressive results on various natural image segmentation tasks. However, medical image segmentation (MIS) is more challenging because of the complex modalities, fine anatomical structures, uncertain and complex object boundaries, and wide-range object scales. To fully validate SAM's performance on medical data, we collected and sorted 53 open-source datasets and built a large medical segmentation dataset with 18 modalities, 84 objects, 125 object-modality paired targets, 1050K 2D images, and 6033K masks. We comprehensively analyzed different models and strategies on the so-called COSMOS 1050K dataset. Our findings mainly include the following: 1) SAM showed remarkable performance in some specific objects but was unstable, imperfect, or even totally failed in other situations. 2) SAM with the large ViT-H showed better overall performance than that with the small ViT-B. 3) SAM performed better with manual hints, especially box, than the Everything mode. 4) SAM could help human annotation with high labeling quality and less time. 5) SAM was sensitive to the randomness in the center point and tight box prompts, and may suffer from a serious performance drop. 6) SAM performed better than interactive methods with one or a few points, but will be outpaced as the number of points increases. 7) SAM's performance correlated to different factors, including boundary complexity, intensity differences, etc. 8) Finetuning the SAM on specific medical tasks could improve its average DICE performance by 4.39% and 6.68% for ViT-B and ViT-H, respectively. We hope that this comprehensive report can help researchers explore the potential of SAM applications in MIS, and guide how to appropriately use and develop SAM.
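The Dice score reported in the abstract above measures the overlap between a predicted mask and a reference mask. The following is a minimal sketch of that metric, not the authors' evaluation code. The function name `dice_score`, the nested-list mask format, and the convention that two empty masks score 1.0 are assumptions made for illustration.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as nested lists of 0/1."""
    pred_flat = [bool(v) for row in pred for v in row]
    target_flat = [bool(v) for row in target for v in row]
    if len(pred_flat) != len(target_flat):
        raise ValueError("masks must have the same number of pixels")
    # Count pixels that are foreground in both masks.
    intersection = sum(p and t for p, t in zip(pred_flat, target_flat))
    total = sum(pred_flat) + sum(target_flat)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy 2x3 masks (hypothetical data): 2 shared foreground pixels, 3 + 3 in total.
pred = [[1, 1, 0], [0, 1, 0]]
target = [[1, 0, 0], [0, 1, 1]]
print(round(dice_score(pred, target), 4))  # 2 * 2 / (3 + 3) = 0.6667
```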

CV · Aug 8, 2022
Deep Computational Model for the Inference of Ventricular Activation Properties

Lei Li, Julia Camps, Abhirup Banerjee et al. · oxford

Patient-specific cardiac computational models are essential for the efficient realization of precision medicine and in-silico clinical trials using digital twins. Cardiac digital twins can provide non-invasive characterizations of cardiac functions for individual patients, and therefore are promising for patient-specific diagnosis and therapy stratification. However, current workflows for both the anatomical and functional twinning phases, referring to the inference of model anatomy and parameters from clinical data, are not sufficiently efficient, robust, and accurate. In this work, we propose a deep learning-based patient-specific computational model, which can fuse both anatomical and electrophysiological information for the inference of ventricular activation properties, i.e., conduction velocities and root nodes. The activation properties can provide a quantitative assessment of cardiac electrophysiological function for the guidance of interventional procedures. We employ the Eikonal model to generate simulated electrocardiograms (ECGs) with ground truth properties to train the inference model, where specific patient information has also been considered. For evaluation, we test the model on the simulated data and obtain generally promising results with fast computational time.

IV · Jul 17, 2023
Multi-class point cloud completion networks for 3D cardiac anatomy reconstruction from cine magnetic resonance images

Marcel Beetz, Abhirup Banerjee, Julius Ossenberg-Engels et al. · oxford

Cine magnetic resonance imaging (MRI) is the current gold standard for the assessment of cardiac anatomy and function. However, it typically only acquires a set of two-dimensional (2D) slices of the underlying three-dimensional (3D) anatomy of the heart, thus limiting the understanding and analysis of both healthy and pathological cardiac morphology and physiology. In this paper, we propose a novel fully automatic surface reconstruction pipeline capable of reconstructing multi-class 3D cardiac anatomy meshes from raw cine MRI acquisitions. Its key component is a multi-class point cloud completion network (PCCN) capable of correcting both the sparsity and misalignment issues of the 3D reconstruction task in a unified model. We first evaluate the PCCN on a large synthetic dataset of biventricular anatomies and observe Chamfer distances between reconstructed and gold standard anatomies below or similar to the underlying image resolution for multiple levels of slice misalignment. Furthermore, we find a reduction in reconstruction error compared to a benchmark 3D U-Net by 32% and 24% in terms of Hausdorff distance and mean surface distance, respectively. We then apply the PCCN as part of our automated reconstruction pipeline to 1000 subjects from the UK Biobank study in a cross-domain transfer setting and demonstrate its ability to reconstruct accurate and topologically plausible biventricular heart meshes with clinical metrics comparable to the previous literature. Finally, we investigate the robustness of our proposed approach and observe its capacity to successfully handle multiple common outlier conditions.
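The Chamfer distance used above to compare reconstructed and gold standard anatomies can be sketched in a few lines. This is an illustrative brute-force version under stated assumptions, not the authors' implementation: it uses unsquared Euclidean distances and sums the two directional means, while other definitions square the distances or average the two directions. The function name `chamfer_distance` and the toy point clouds are hypothetical.

```python
import math


def chamfer_distance(a, b):
    """Symmetric Chamfer distance between two point clouds (lists of coordinate tuples).

    Assumed definition: mean nearest-neighbour distance from a to b
    plus mean nearest-neighbour distance from b to a.
    """
    if not a or not b:
        raise ValueError("both point clouds must be non-empty")

    def one_way(src, dst):
        # For each source point, find the closest destination point (O(n * m)).
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return one_way(a, b) + one_way(b, a)


# Toy 3D point clouds (hypothetical data).
cloud_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
cloud_b = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0)]
print(chamfer_distance(cloud_a, cloud_b))  # 0.5 + 0.5 = 1.0
```

For clouds with many thousands of points, a spatial index such as a k-d tree would typically replace the brute-force nearest-neighbour search.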

CV · Aug 8, 2024 · Code
Medical Graph RAG: Towards Safe Medical Large Language Model via Graph Retrieval-Augmented Generation

Junde Wu, Jiayuan Zhu, Yunli Qi et al.

We introduce a novel graph-based Retrieval-Augmented Generation (RAG) framework specifically designed for the medical domain, called MedGraphRAG, aimed at enhancing Large Language Model (LLM) capabilities for generating evidence-based medical responses, thereby improving safety and reliability when handling private medical data. Graph-based RAG (GraphRAG) leverages LLMs to organize RAG data into graphs, showing strong potential for gaining holistic insights from long-form documents. However, its standard implementation is overly complex for general use and lacks the ability to generate evidence-based responses, limiting its effectiveness in the medical field. To extend the capabilities of GraphRAG to the medical domain, we propose unique Triple Graph Construction and U-Retrieval techniques over it. In our graph construction, we create a triple-linked structure that connects user documents to credible medical sources and controlled vocabularies. In the retrieval process, we propose U-Retrieval, which combines Top-down Precise Retrieval with Bottom-up Response Refinement to balance global context awareness with precise indexing. These efforts enable both source information retrieval and comprehensive response generation. Our approach is validated on 9 medical Q&A benchmarks, 2 health fact-checking benchmarks, and one collected dataset testing long-form generation. The results show that MedGraphRAG consistently outperforms state-of-the-art models across all benchmarks, while also ensuring that responses include credible source documentation and definitions. Our code is released at: https://github.com/MedicineToken/Medical-Graph-RAG.

IV · Jul 20, 2023
Modeling 3D cardiac contraction and relaxation with point cloud deformation networks

Marcel Beetz, Abhirup Banerjee, Vicente Grau · oxford

Global single-valued biomarkers of cardiac function typically used in clinical practice, such as ejection fraction, provide limited insight on the true 3D cardiac deformation process and hence, limit the understanding of both healthy and pathological cardiac mechanics. In this work, we propose the Point Cloud Deformation Network (PCD-Net) as a novel geometric deep learning approach to model 3D cardiac contraction and relaxation between the extreme ends of the cardiac cycle. It employs the recent advances in point cloud-based deep learning into an encoder-decoder structure, in order to enable efficient multi-scale feature learning directly on multi-class 3D point cloud representations of the cardiac anatomy. We evaluate our approach on a large dataset of over 10,000 cases from the UK Biobank study and find average Chamfer distances between the predicted and ground truth anatomies below the pixel resolution of the underlying image acquisition. Furthermore, we observe similar clinical metrics between predicted and ground truth populations and show that the PCD-Net can successfully capture subpopulation-specific differences between normal subjects and myocardial infarction (MI) patients. We then demonstrate that the learned 3D deformation patterns outperform multiple clinical benchmarks by 13% and 7% in terms of area under the receiver operating characteristic curve for the tasks of prevalent MI detection and incident MI prediction and by 7% in terms of Harrell's concordance index for MI survival analysis.

CV · Jul 14, 2023
3D Shape-Based Myocardial Infarction Prediction Using Point Cloud Classification Networks

Marcel Beetz, Yilong Yang, Abhirup Banerjee et al. · oxford

Myocardial infarction (MI) is one of the most prevalent cardiovascular diseases with associated clinical decision-making typically based on single-valued imaging biomarkers. However, such metrics only approximate the complex 3D structure and physiology of the heart and hence hinder a better understanding and prediction of MI outcomes. In this work, we investigate the utility of complete 3D cardiac shapes in the form of point clouds for an improved detection of MI events. To this end, we propose a fully automatic multi-step pipeline consisting of a 3D cardiac surface reconstruction step followed by a point cloud classification network. Our method utilizes recent advances in geometric deep learning on point clouds to enable direct and efficient multi-scale learning on high-resolution surface models of the cardiac anatomy. We evaluate our approach on 1068 UK Biobank subjects for the tasks of prevalent MI detection and incident MI prediction and find improvements of ~13% and ~5% respectively over clinical benchmarks. Furthermore, we analyze the role of each ventricle and cardiac phase for 3D shape-based MI detection and conduct a visual analysis of the morphological and physiological patterns typically associated with MI outcomes.

CV · Jul 20, 2023
Multi-objective point cloud autoencoders for explainable myocardial infarction prediction

Marcel Beetz, Abhirup Banerjee, Vicente Grau · oxford

Myocardial infarction (MI) is one of the most common causes of death in the world. Image-based biomarkers commonly used in the clinic, such as ejection fraction, fail to capture more complex patterns in the heart's 3D anatomy and thus limit diagnostic accuracy. In this work, we present the multi-objective point cloud autoencoder as a novel geometric deep learning approach for explainable infarction prediction, based on multi-class 3D point cloud representations of cardiac anatomy and function. Its architecture consists of multiple task-specific branches connected by a low-dimensional latent space to allow for effective multi-objective learning of both reconstruction and MI prediction, while capturing pathology-specific 3D shape information in an interpretable latent space. Furthermore, its hierarchical branch design with point cloud-based deep learning operations enables efficient multi-scale feature learning directly on high-resolution anatomy point clouds. In our experiments on a large UK Biobank dataset, the multi-objective point cloud autoencoder is able to accurately reconstruct multi-temporal 3D shapes with Chamfer distances between predicted and input anatomies below the underlying images' pixel resolution. Our method outperforms multiple machine learning and deep learning benchmarks for the task of incident MI prediction by 19% in terms of Area Under the Receiver Operating Characteristic curve. In addition, its task-specific compact latent space exhibits easily separable control and MI clusters with clinically plausible associations between subject encodings and corresponding 3D shapes, thus demonstrating the explainability of the prediction.

SP · Jul 10, 2023
Towards Enabling Cardiac Digital Twins of Myocardial Infarction Using Deep Computational Models for Inverse Inference

Lei Li, Julia Camps, Zhinuo et al.

Cardiac digital twins (CDTs) have the potential to offer individualized evaluation of cardiac function in a non-invasive manner, making them a promising approach for personalized diagnosis and treatment planning of myocardial infarction (MI). The inference of accurate myocardial tissue properties is crucial in creating a reliable CDT of MI. In this work, we investigate the feasibility of inferring myocardial tissue properties from the electrocardiogram (ECG) within a CDT platform. The platform integrates multi-modal data, such as cardiac MRI and ECG, to enhance the accuracy and reliability of the inferred tissue properties. We perform a sensitivity analysis based on computer simulations, systematically exploring the effects of infarct location, size, degree of transmurality, and electrical activity alteration on the simulated QRS complex of ECG, to establish the limits of the approach. We subsequently present a novel deep computational model, comprising a dual-branch variational autoencoder and an inference model, to infer infarct location and distribution from the simulated QRS. The proposed model achieves mean Dice scores of 0.457 ± 0.317 and 0.302 ± 0.273 for the inference of left ventricle scars and border zone, respectively. The sensitivity analysis enhances our understanding of the complex relationship between infarct characteristics and electrophysiological features. The in silico experimental results show that the model can effectively capture the relationship for the inverse inference, with promising potential for clinical application in the future. The code will be released publicly once the manuscript is accepted for publication.

IV · Aug 26, 2022
Multi-Modality Cardiac Image Computing: A Survey

Lei Li, Wangbin Ding, Liqun Huang et al.

Multi-modality cardiac imaging plays a key role in the management of patients with cardiovascular diseases. It allows a combination of complementary anatomical, morphological and functional information, increases diagnostic accuracy, and improves the efficacy of cardiovascular interventions and clinical outcomes. Fully-automated processing and quantitative analysis of multi-modality cardiac images could have a direct impact on clinical research and evidence-based patient management. However, these require overcoming significant challenges including inter-modality misalignment and finding optimal methods to integrate information from different modalities. This paper aims to provide a comprehensive review of multi-modality imaging in cardiology, the computing methods, the validation strategies, the related clinical workflows and future perspectives. For the computing methodologies, we focus on three tasks, i.e., registration, fusion and segmentation, which generally involve multi-modality imaging data, either combining information from different modalities or transferring information across modalities. The review highlights that multi-modality cardiac imaging data has the potential of wide applicability in the clinic, such as trans-aortic valve implantation guidance, myocardial viability assessment, and catheter ablation therapy and its patient selection. Nevertheless, many challenges remain unsolved, such as missing modality, combination of imaging and non-imaging data, and uniform analysis and representation of different modalities. There is also work to do in defining how the well-developed techniques fit in clinical workflows and how much additional and relevant information they introduce. These problems are likely to continue to be an active field of research and the questions to be answered in the future.

IV · Aug 25, 2024 · Code
Personalized Topology-Informed Localization of Standard 12-Lead ECG Electrode Placement from Incomplete Cardiac MRIs for Efficient Cardiac Digital Twins

Lei Li, Hannah Smith, Yilin Lyu et al.

Cardiac digital twins (CDTs) offer personalized in-silico cardiac representations for the inference of multi-scale properties tied to cardiac mechanisms. The creation of CDTs requires precise information about the electrode position on the torso, especially for the personalized electrocardiogram (ECG) calibration. However, current studies commonly rely on additional acquisition of torso imaging and manual/semi-automatic methods for ECG electrode localization. In this study, we propose a novel and efficient topology-informed model to fully automatically extract personalized ECG standard electrode locations from 2D clinically standard cardiac MRIs. Specifically, we obtain the sparse torso contours from the cardiac MRIs and then localize the standard electrodes of 12-lead ECG from the contours. Cardiac MRIs aim at imaging of the heart instead of the torso, leading to incomplete torso geometry within the imaging. To tackle the missing topology, we incorporate the electrodes as a subset of the keypoints, which can be explicitly aligned with the 3D torso topology. The experimental results demonstrate that the proposed model outperforms the time-consuming conventional model projection-based method in terms of accuracy (Euclidean distance: 1.24 ± 0.293 cm vs. 1.48 ± 0.362 cm) and efficiency (2 s vs. 30-35 min). We further demonstrate the effectiveness of using the detected electrodes for in-silico ECG simulation, highlighting their potential for creating accurate and efficient CDT models. The code is available at https://github.com/lileitech/12lead_ECG_electrode_localizer.

CV · Apr 4, 2023
Influence of Myocardial Infarction on QRS Properties: A Simulation Study

Lei Li, Julia Camps, Zhinuo et al.

The interplay between structural and electrical changes in the heart after myocardial infarction (MI) plays a key role in the initiation and maintenance of arrhythmia. The anatomical and electrophysiological properties of scar, border zone, and normal myocardium modify the electrocardiographic morphology, which is routinely analysed in clinical settings. However, the influence of various MI properties on the QRS is not intuitively predictable. In this work, we have systematically investigated the effects of 17 post-MI scenarios, varying the location, size, transmural extent, and conductive level of scarring and border zone area, on the forward-calculated QRS. Additionally, we have compared the contributions of different QRS score criteria for quantifying post-MI pathophysiology. The propagation of electrical activity in the ventricles is simulated via an Eikonal model on a unified coordinate system. The analysis has been performed on 49 subjects, and the results imply that the QRS is capable of identifying MI, suggesting the feasibility of inversely reconstructing infarct regions from QRS. The sensitivity of different QRS criteria for identifying the 17 MI scenarios varies, which is informative for solving the inverse problem.

IV · Jul 19, 2024
DeepCA: Deep Learning-based 3D Coronary Artery Tree Reconstruction from Two 2D Non-simultaneous X-ray Angiography Projections

Yiying Wang, Abhirup Banerjee, Robin P. Choudhury et al. · oxford

Cardiovascular diseases (CVDs) are the most common cause of death worldwide. Invasive x-ray coronary angiography (ICA) is one of the most important imaging modalities for the diagnosis of CVDs. ICA typically acquires only two 2D projections, which makes the 3D geometry of coronary vessels difficult to interpret, thus requiring 3D coronary artery tree reconstruction from two projections. State-of-the-art approaches require significant manual interactions and cannot correct the non-rigid cardiac and respiratory motions between non-simultaneous projections. In this study, we propose a novel deep learning pipeline named DeepCA. We leverage the Wasserstein conditional generative adversarial network with gradient penalty, latent convolutional transformer layers, and a dynamic snake convolutional critic to implicitly compensate for the non-rigid motion and provide 3D coronary artery tree reconstruction. Through simulating projections from coronary computed tomography angiography (CCTA), we achieve the generalisation of 3D coronary tree reconstruction on real non-simultaneous ICA projections. We incorporate an application-specific evaluation metric to validate our proposed model on both a CCTA dataset and a real ICA dataset, together with Chamfer ℓ2 distance. The results demonstrate promising performance of our DeepCA model in vessel topology preservation, recovery of missing features, and generalisation ability to real ICA data. To the best of our knowledge, this is the first study that leverages deep learning to achieve 3D coronary tree reconstruction from two real non-simultaneous x-ray angiographic projections.

IV · Sep 6, 2024
NeCA: 3D Coronary Artery Tree Reconstruction from Two 2D Projections via Neural Implicit Representation

Yiying Wang, Abhirup Banerjee, Vicente Grau · oxford

Cardiovascular diseases (CVDs) are the most common health threats worldwide. 2D X-ray invasive coronary angiography (ICA) remains the most widely adopted imaging modality for CVD assessment during real-time cardiac interventions. However, it is often difficult for cardiologists to interpret the 3D geometry of coronary vessels based on 2D planes. Moreover, due to the radiation limit, often only two angiographic projections are acquired, providing limited information of the vessel geometry and necessitating 3D coronary tree reconstruction based only on two ICA projections. In this paper, we propose a self-supervised deep learning method called NeCA, which is based on neural implicit representation using the multiresolution hash encoder and differentiable cone-beam forward projector layer, in order to achieve 3D coronary artery tree reconstruction from two 2D projections. We validate our method using six different metrics on a dataset generated from coronary computed tomography angiography of right coronary artery and left anterior descending artery. The evaluation results demonstrate that our NeCA method, without requiring 3D ground truth for supervision or large datasets for training, achieves promising performance in both vessel topology and branch-connectivity preservation compared to the supervised deep learning model.

IV · Mar 14 · Code
EchoLVFM: One-Step Video Generation via Latent Flow Matching for Echocardiogram Synthesis

Emmanuel Oladokun, Sarina Thomas, Jurica Šprem et al.

Echocardiography is widely used for assessing cardiac function, where clinically meaningful parameters such as left-ventricular ejection fraction (EF) play a central role in diagnosis and management. Generative models capable of synthesising realistic echocardiogram videos with explicit control over such parameters are valuable for data augmentation, counterfactual analysis, and specialist training. However, existing approaches typically rely on computationally expensive multi-step sampling and aggressive temporal normalisation, limiting efficiency and applicability to heterogeneous real-world data. We introduce EchoLVFM, a one-step latent video flow-matching framework for controllable echocardiogram generation. Operating in the latent space, EchoLVFM synthesises temporally coherent videos in a single inference step, achieving a ~50× improvement in sampling efficiency compared to multi-step flow baselines while maintaining visual fidelity. The model supports global conditioning on clinical variables, demonstrated through precise control of EF, and enables reconstruction and counterfactual generation from partially observed sequences. A masked conditioning strategy further removes fixed-length constraints, allowing shorter sequences to be retained rather than discarded. We evaluate EchoLVFM on the CAMUS dataset under challenging single-frame conditioning. Quantitative and qualitative results demonstrate competitive video quality, strong EF adherence, and 57.9% discrimination accuracy by expert clinicians, which is close to chance. These findings indicate that efficient, one-step flow matching can enable practical, controllable echocardiogram video synthesis without sacrificing fidelity. Code available at: https://github.com/EngEmmanuel/EchoLVFM

CV · Apr 15, 2025 · Code
From Gaze to Insight: Bridging Human Visual Attention and Vision Language Model Explanation for Weakly-Supervised Medical Image Segmentation

Jingkun Chen, Haoran Duan, Xiao Zhang et al.

Medical image segmentation remains challenging due to the high cost of pixel-level annotations for training. In the context of weak supervision, clinician gaze data captures regions of diagnostic interest; however, its sparsity limits its use for segmentation. In contrast, vision-language models (VLMs) provide semantic context through textual descriptions but lack the explanation precision required. Recognizing that neither source alone suffices, we propose a teacher-student framework that integrates both gaze and language supervision, leveraging their complementary strengths. Our key insight is that gaze data indicates where clinicians focus during diagnosis, while VLMs explain why those regions are significant. To implement this, the teacher model first learns from gaze points enhanced by VLM-generated descriptions of lesion morphology, establishing a foundation for guiding the student model. The teacher then directs the student through three strategies: (1) Multi-scale feature alignment to fuse visual cues with textual semantics; (2) Confidence-weighted consistency constraints to focus on reliable predictions; (3) Adaptive masking to limit error propagation in uncertain areas. Experiments on the Kvasir-SEG, NCI-ISBI, and ISIC datasets show that our method achieves Dice scores of 80.78%, 80.53%, and 84.22%, respectively, improving by 3-5% over gaze baselines without increasing the annotation burden. By preserving correlations among predictions, gaze data, and lesion descriptions, our framework also maintains clinical interpretability. This work illustrates how integrating human visual attention with AI-generated semantic context can effectively overcome the limitations of individual weak supervision signals, thereby advancing the development of deployable, annotation-efficient medical AI systems. Code is available at: https://github.com/jingkunchen/FGI.

CV · Oct 1, 2025 · Code
AortaDiff: A Unified Multitask Diffusion Framework For Contrast-Free AAA Imaging

Yuxuan Ou, Ning Bi, Jiazhen Pan et al.

While contrast-enhanced CT (CECT) is standard for assessing abdominal aortic aneurysms (AAA), the required iodinated contrast agents pose significant risks, including nephrotoxicity, patient allergies, and environmental harm. To reduce contrast agent use, recent deep learning methods have focused on generating synthetic CECT from non-contrast CT (NCCT) scans. However, most adopt a multi-stage pipeline that first generates images and then performs segmentation, which leads to error accumulation and fails to leverage shared semantic and anatomical structures. To address this, we propose a unified deep learning framework that generates synthetic CECT images from NCCT scans while simultaneously segmenting the aortic lumen and thrombus. Our approach integrates conditional diffusion models (CDM) with multi-task learning, enabling end-to-end joint optimization of image synthesis and anatomical segmentation. Unlike previous multitask diffusion models, our approach requires no initial predictions (e.g., a coarse segmentation mask), shares both encoder and decoder parameters across tasks, and employs a semi-supervised training strategy to learn from scans with missing segmentation labels, a common constraint in real-world clinical data. We evaluated our method on a cohort of 264 patients, where it consistently outperformed state-of-the-art single-task and multi-stage models. For image synthesis, our model achieved a PSNR of 25.61 dB, compared to 23.80 dB from a single-task CDM. For anatomical segmentation, it improved the lumen Dice score to 0.89 from 0.87 and the challenging thrombus Dice score to 0.53 from 0.48 (nnU-Net). These segmentation enhancements led to more accurate clinical measurements, reducing the lumen diameter MAE to 4.19 mm from 5.78 mm and the thrombus area error to 33.85% from 41.45% when compared to nnU-Net. Code is available at https://github.com/yuxuanou623/AortaDiff.git.

QM · Mar 15, 2024
Large Language Model-informed ECG Dual Attention Network for Heart Failure Risk Prediction

Chen Chen, Lei Li, Marcel Beetz et al. · oxford

Heart failure (HF) poses a significant public health challenge, with a rising global mortality rate. Early detection and prevention of HF could significantly reduce its impact. We introduce a novel methodology for predicting HF risk using 12-lead electrocardiograms (ECGs). We present a novel, lightweight dual-attention ECG network designed to capture complex ECG features essential for early HF risk prediction, despite the notable imbalance between low and high-risk groups. This network incorporates a cross-lead attention module and twelve lead-specific temporal attention modules, focusing on cross-lead interactions and each lead's local dynamics. To further alleviate model overfitting, we leverage a large language model (LLM) with a public ECG-Report dataset for pretraining on an ECG-report alignment task. The network is then fine-tuned for HF risk prediction using two specific cohorts from the UK Biobank study, focusing on patients with hypertension (UKB-HYP) and those who have had a myocardial infarction (UKB-MI). The results reveal that LLM-informed pre-training substantially enhances HF risk prediction in these cohorts. The dual-attention design improves not only interpretability but also predictive accuracy, outperforming existing competitive methods with C-index scores of 0.6349 for UKB-HYP and 0.5805 for UKB-MI. This demonstrates our method's potential in advancing HF risk assessment with clinically complex ECG data.
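The C-index quoted above is the fraction of comparable subject pairs in which the subject with the earlier observed event also received the higher predicted risk. The following is a minimal sketch of Harrell's concordance index under simplifying assumptions, not the authors' evaluation code: pairs with tied event times are ignored, and tied risk scores count as half concordant. The function name `concordance_index` and the toy data are hypothetical.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data (simplified sketch).

    times:  follow-up time per subject
    events: 1 if the event was observed, 0 if the subject was censored
    risks:  predicted risk score per subject (higher means earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            # A pair is comparable when subject i had an observed event
            # strictly before subject j's follow-up time.
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy cohort (hypothetical data): the third subject is censored at time 6.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risks = [0.9, 0.3, 0.5, 0.1]
print(concordance_index(times, events, risks))  # 4 concordant of 5 comparable pairs = 0.8
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.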

CV · Jan 29, 2025
Unsupervised Patch-GAN with Targeted Patch Ranking for Fine-Grained Novelty Detection in Medical Imaging

Jingkun Chen, Guang Yang, Xiao Zhang et al.

Detecting novel anomalies in medical imaging is challenging due to the limited availability of labeled data for rare abnormalities, which often display high variability and subtlety. This challenge is further compounded when small abnormal regions are embedded within larger normal areas, as whole-image predictions frequently overlook these subtle deviations. To address these issues, we propose an unsupervised Patch-GAN framework designed to detect and localize anomalies by capturing both local detail and global structure. Our framework first reconstructs masked images to learn fine-grained, normal-specific features, allowing for enhanced sensitivity to minor deviations from normality. By dividing these reconstructed images into patches and assessing the authenticity of each patch, our approach identifies anomalies at a more granular level, overcoming the limitations of whole-image evaluation. Additionally, a patch-ranking mechanism prioritizes regions with higher abnormal scores, reinforcing the alignment between local patch discrepancies and the global image context. Experimental results on the ISIC 2016 skin lesion and BraTS 2019 brain tumor datasets validate our framework's effectiveness, achieving AUCs of 95.79% and 96.05%, respectively, and outperforming three state-of-the-art baselines.

IV · Jan 9, 2025
Contrast-Free Myocardial Scar Segmentation in Cine MRI using Motion and Texture Fusion

Guang Yang, Jingkun Chen, Xicheng Sheng et al.

Late gadolinium enhancement MRI (LGE MRI) is the gold standard for the detection of myocardial scars after myocardial infarction (MI). LGE MRI requires the injection of a contrast agent, which carries potential side effects and increases scanning time and patient discomfort. To address these issues, we propose a novel framework that combines cardiac motion observed in cine MRI with image texture information to segment the myocardium and scar tissue in the left ventricle. Cardiac motion tracking can be formulated as a full cardiac image cycle registration problem, which can be solved via deep neural networks. Experimental results show that the proposed method can achieve scar segmentation based on non-contrasted cine images with comparable accuracy to LGE MRI. This demonstrates its potential as an alternative to contrast-enhanced techniques for scar detection.

MED-PHDec 21, 2023
Anatomical basis of sex differences in the electrocardiogram identified by three-dimensional torso-heart imaging reconstruction pipeline

Hannah J. Smith, Blanca Rodriguez, Yuling Sang et al. · oxford

The electrocardiogram (ECG) is used for diagnosis and risk stratification following myocardial infarction (MI). Women have a higher incidence of missed MI diagnosis and complications following infarction, and to address this we aim to provide quantitative information on sex differences in ECG and torso-ventricular anatomy features. A novel computational automated pipeline is presented enabling the three-dimensional reconstruction of torso-ventricular anatomies for 425 post-MI subjects and 1051 healthy controls from UK Biobank clinical images. Regression models were created relating torso-ventricular and ECG parameters. For post-MI women, the heart is positioned more posteriorly and vertically than in men (with healthy women yet more vertical). Post-MI women exhibit less QRS prolongation, requiring 27% more prolongation than men to exceed 120 ms. Only half of the sex difference in QRS is associated with smaller female cavities. Lower STj amplitude in women is striking, associated with smaller ventricles, but also more superior and posterior cardiac position. Post-MI, T wave amplitude and R axis deviations are strongly associated with a more posterior and horizontal cardiac position in women (but not in men). Our study highlights the need to quantify sex differences in anatomical features, their implications in ECG interpretation, and the application of clinical ECG thresholds in post-MI.

CVNov 18, 2025
Seeing Beyond the Image: ECG and Anatomical Knowledge-Guided Myocardial Scar Segmentation from Late Gadolinium-Enhanced Images

Farheen Ramzan, Yusuf Kiberu, Nikesh Jathanna et al.

Accurate segmentation of myocardial scar from late gadolinium-enhanced (LGE) cardiac MRI is essential for evaluating tissue viability, yet remains challenging due to variable contrast and imaging artifacts. Electrocardiogram (ECG) signals provide complementary physiological information, as conduction abnormalities can help localize or suggest scarred myocardial regions. In this work, we propose a novel multimodal framework that integrates ECG-derived electrophysiological information with anatomical priors from the AHA-17 atlas for physiologically consistent LGE-based scar segmentation. As ECGs and LGE-MRIs are not acquired simultaneously, we introduce a Temporal Aware Feature Fusion (TAFF) mechanism that dynamically weights and fuses features based on their acquisition time difference. Our method was evaluated on a clinical dataset and achieved substantial gains over the state-of-the-art image-only baseline (nnU-Net), increasing the average Dice score for scars from 0.6149 to 0.8463 and achieving high performance in both precision (0.9115) and sensitivity (0.9043). These results show that integrating physiological and anatomical knowledge allows the model to "see beyond the image", setting a new direction for robust and physiologically grounded cardiac scar segmentation.
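For readers unfamiliar with the reported metrics, the following minimal sketch computes Dice, precision, and sensitivity from flattened binary masks. The function name `overlap_metrics` and the convention of returning 1.0 when a denominator is empty are assumptions made for illustration; the paper's evaluation code may differ.

```python
from typing import Sequence, Tuple


def overlap_metrics(pred: Sequence[int], truth: Sequence[int]) -> Tuple[float, float, float]:
    """Return (dice, precision, sensitivity) for two flattened binary masks."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)        # scar predicted, scar present
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)    # scar predicted, none present
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)    # scar missed
    # Assumed convention: an empty denominator counts as a perfect score.
    dice = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 1.0
    return dice, precision, sensitivity


# One true positive, one false positive, one false negative.
dice, precision, sensitivity = overlap_metrics([1, 1, 0, 0], [1, 0, 1, 0])
```

Dice penalizes false positives and false negatives symmetrically, whereas precision and sensitivity separate the two error types, which is why the abstract reports all three.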

CVNov 21, 2025
AngioDG: Interpretable Channel-informed Feature-modulated Single-source Domain Generalization for Coronary Vessel Segmentation in X-ray Angiography

Mohammad Atwany, Mojtaba Lashgari, Robin P. Choudhury et al.

Cardiovascular diseases are the leading cause of death globally, with X-ray Coronary Angiography (XCA) as the gold standard during real-time cardiac interventions. Segmentation of coronary vessels from XCA can facilitate downstream quantitative assessments, such as measuring stenosis severity, and enhance clinical decision-making. However, developing generalizable vessel segmentation models for XCA is challenging due to variations in imaging protocols and patient demographics that cause domain shifts. These limitations are exacerbated by the lack of annotated datasets, making Single-source Domain Generalization (SDG) a necessary solution for achieving generalization. Existing SDG methods are largely augmentation-based, which may not guarantee the mitigation of overfitting to augmented or synthetic domains. We propose a novel approach, "AngioDG", to bridge this gap with a channel regularization strategy that promotes generalization. Our method identifies the contributions of early feature channels to task-specific metrics for DG, facilitating interpretability, and then reweights channels to calibrate and amplify domain-invariant features while attenuating domain-specific ones. We evaluate AngioDG on six X-ray angiography datasets for coronary vessel segmentation, achieving the best out-of-distribution performance among the compared methods, while maintaining consistent in-domain test performance.

IVAug 18, 2025
3D Cardiac Anatomy Generation Using Mesh Latent Diffusion Models

Jolanta Mozyrska, Marcel Beetz, Luke Melas-Kyriazi et al.

Diffusion models have recently gained immense interest for their generative capabilities, specifically the high quality and diversity of the synthesized data. However, examples of their applications in 3D medical imaging are still scarce, especially in cardiology. Generating diverse realistic cardiac anatomies is crucial for applications such as in silico trials, electromechanical computer simulations, or data augmentation for machine learning models. In this work, we investigate the application of Latent Diffusion Models (LDMs) for generating 3D meshes of human cardiac anatomies. To this end, we propose a novel LDM architecture, MeshLDM. We apply the proposed model to a dataset of 3D meshes of left ventricular cardiac anatomies from patients with acute myocardial infarction and evaluate its performance in terms of both qualitative and quantitative clinical and 3D mesh reconstruction metrics. The proposed MeshLDM successfully captures characteristics of the cardiac shapes at end-diastolic (relaxation) and end-systolic (contraction) cardiac phases, generating meshes with a 2.4% difference in population mean compared to the gold standard.

IVAug 18, 2025
From Transthoracic to Transesophageal: Cross-Modality Generation using LoRA Diffusion

Emmanuel Oladokun, Yuxuan Ou, Anna Novikova et al.

Deep diffusion models excel at realistic image synthesis but demand large training sets, an obstacle in data-scarce domains like transesophageal echocardiography (TEE). While synthetic augmentation has boosted performance in transthoracic echo (TTE), TEE remains critically underrepresented, limiting the reach of deep learning in this high-impact modality. We address this gap by adapting a TTE-trained, mask-conditioned diffusion backbone to TEE with only a limited number of new cases and adapters as small as $10^5$ parameters. Our pipeline combines Low-Rank Adaptation with MaskR$^2$, a lightweight remapping layer that aligns novel mask formats with the pretrained model's conditioning channels. This design lets users adapt models to new datasets whose set of anatomical structures differs from the base model's original set. Through a targeted adaptation strategy, we find that adapting only MLP layers suffices for high-fidelity TEE synthesis. Finally, mixing fewer than 200 real TEE frames with our synthetic echoes improves the Dice score on a multiclass segmentation task, particularly boosting performance on underrepresented right-heart structures. Our results demonstrate that (1) semantically controlled TEE images can be generated with low overhead, (2) MaskR$^2$ effectively transforms unseen mask formats into compatible formats without damaging downstream task performance, and (3) our method generates images that are effective for improving performance on a downstream task of multiclass segmentation.
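To show why low-rank adapters can stay so small, here is a minimal sketch of a single Low-Rank Adaptation update on one linear layer, written with plain Python lists. It follows the standard LoRA formulation (frozen weight `W` plus a scaled product `B·A`); the helper names, the `alpha / r` scaling, and the layer sizes are illustrative assumptions, and neither the paper's choice of adapted layers nor MaskR$^2$ is reproduced here.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def lora_forward(W: Matrix, A: Matrix, B: Matrix, x: Vector, alpha: float) -> Vector:
    """Compute W x + (alpha / r) * B (A x), where r is the adapter rank.

    W (d_out x d_in) stays frozen; only A (r x d_in) and B (d_out x r) are trained.
    """
    r = len(A)
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]


def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters added by one rank-r adapter."""
    return r * (d_in + d_out)


W = [[1.0, 0.0], [0.0, 1.0]]   # frozen 2x2 identity weight
A = [[1.0, 1.0]]               # rank-1 down-projection
B = [[1.0], [0.0]]             # rank-1 up-projection
y = lora_forward(W, A, B, x=[1.0, 2.0], alpha=1.0)
extra = lora_param_count(d_in=1024, d_out=1024, r=4)
```

For a hypothetical 1024 x 1024 layer, a rank-4 adapter adds 8,192 trainable parameters against roughly a million frozen ones, which gives a sense of how a handful of adapted layers can total on the order of $10^5$ parameters.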

IVJul 21, 2025
Personalized 4D Whole Heart Geometry Reconstruction from Cine MRI for Cardiac Digital Twins

Xiaoyue Liu, Xicheng Sheng, Xiahai Zhuang et al.

Cardiac digital twins (CDTs) provide personalized in-silico cardiac representations and hold great potential for precision medicine in cardiology. However, whole-heart CDT models that simulate the full organ-scale electromechanics of all four heart chambers remain limited. In this work, we propose a weakly supervised learning model to reconstruct 4D (3D+t) heart meshes directly from multi-view 2D cardiac cine MRIs. This is achieved by learning a self-supervised mapping between cine MRIs and 4D cardiac meshes, enabling the generation of personalized heart models that closely correspond to input cine MRIs. The resulting 4D heart meshes can facilitate the automatic extraction of key cardiac variables, including ejection fraction and dynamic chamber volume changes with high temporal resolution. This work demonstrates the feasibility of inferring personalized 4D heart models from cardiac MRIs, paving the way for an efficient CDT platform for precision medicine. The code will be publicly released once the manuscript is accepted.
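As an illustration of the kind of variable that can be extracted once chamber volumes are available over the cardiac cycle, the sketch below computes ejection fraction from a volume curve using the standard definition EF = (EDV - ESV) / EDV. The function name and the sample volumes are hypothetical, and computing the per-frame volumes from a mesh is assumed to have been done already.

```python
from typing import Sequence


def ejection_fraction(volumes_ml: Sequence[float]) -> float:
    """Return ejection fraction in percent from per-frame chamber volumes.

    End-diastolic volume (EDV) is taken as the largest volume in the cycle and
    end-systolic volume (ESV) as the smallest.
    """
    edv = max(volumes_ml)
    esv = min(volumes_ml)
    if edv <= 0:
        raise ValueError("end-diastolic volume must be positive")
    return (edv - esv) / edv * 100.0


# Illustrative left-ventricular volumes (mL) over one cardiac cycle.
ef = ejection_fraction([120.0, 100.0, 60.0, 80.0, 115.0])
```

With one volume per reconstructed time frame, the same curve also yields the dynamic chamber volume changes mentioned in the abstract.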

CVJun 17, 2024
Solving the Inverse Problem of Electrocardiography for Cardiac Digital Twins: A Survey

Lei Li, Julia Camps, Blanca Rodriguez et al.

Cardiac digital twins (CDTs) are personalized virtual representations used to understand complex cardiac mechanisms. A critical component of CDT development is solving the ECG inverse problem, which enables the reconstruction of cardiac sources and the estimation of patient-specific electrophysiology (EP) parameters from surface ECG data. Despite challenges from complex cardiac anatomy, noisy ECG data, and the ill-posed nature of the inverse problem, recent advances in computational methods have greatly improved the accuracy and efficiency of ECG inverse inference, strengthening the fidelity of CDTs. This paper aims to provide a comprehensive review of the methods for solving the ECG inverse problem, the validation strategies, the clinical applications, and future perspectives. For the methodologies, we broadly classify state-of-the-art approaches into two categories: deterministic and probabilistic methods, including both conventional and deep learning-based techniques. Integrating physical laws with deep learning models holds promise, but challenges such as capturing dynamic electrophysiology accurately, accessing accurate domain knowledge, and quantifying prediction uncertainty persist. Integrating models into clinical workflows while ensuring interpretability and usability for healthcare professionals is essential. Overcoming these challenges will drive further research in CDTs.

IVMar 2, 2020
A Deep learning Approach to Generate Contrast-Enhanced Computerised Tomography Angiography without the Use of Intravenous Contrast Agents

Anirudh Chandrashekar, Ashok Handa, Natesh Shivakumar et al.

Contrast-enhanced computed tomography angiograms (CTAs) are widely used in cardiovascular imaging to obtain a non-invasive view of arterial structures. However, contrast agents are associated with complications at the injection site as well as renal toxicity leading to contrast-induced nephropathy (CIN) and renal failure. We hypothesised that the raw data acquired from a non-contrast CT contains sufficient information to differentiate blood and other soft tissue components. We utilised deep learning methods to define the subtleties between soft tissue components in order to simulate contrast-enhanced CTAs without contrast agents. Twenty-six patients with paired non-contrast and CTA images were randomly selected from an approved clinical study. Non-contrast axial slices within the abdominal aortic aneurysm (AAA) from 10 patients (n = 100) were sampled for the underlying Hounsfield unit (HU) distribution at the lumen, intra-luminal thrombus and interface locations. Sampling of HUs in these regions revealed significant differences between all regions (p<0.001 for all comparisons), confirming the intrinsic differences in the radiomic signatures between these regions. To generate a large training dataset, paired axial slices from the training set (n=13) were augmented to produce a total of 23,551 2-D images. We trained a 2-D Cycle Generative Adversarial Network (cycleGAN) for this non-contrast to contrast (NC2C) transformation task. The accuracy of the cycleGAN output was assessed by comparison to the contrast image. This pipeline is able to differentiate between visually incoherent soft tissue regions in non-contrast CT images. The CTAs generated from the non-contrast images bear a strong resemblance to the ground truth. Here we describe a novel application of a Generative Adversarial Network for CT image processing. This is poised to disrupt clinical pathways requiring contrast-enhanced CT imaging.

IVFeb 9, 2020
A Deep Learning Approach to Automate High-Resolution Blood Vessel Reconstruction on Computerized Tomography Images With or Without the Use of Contrast Agent

Anirudh Chandrashekar, Ashok Handa, Natesh Shivakumar et al.

Existing methods to reconstruct vascular structures from a computed tomography (CT) angiogram rely on injection of intravenous contrast to enhance the radio-density within the vessel lumen. However, pathological changes can be present in the blood lumen, vessel wall or a combination of both that prevent accurate reconstruction. In the example of aortic aneurysmal disease, a blood clot or thrombus adherent to the aortic wall within the expanding aneurysmal sac is present in 70-80% of cases. These deformations prevent the automatic extraction of vital clinically relevant information by current methods. In this study, we implemented a modified U-Net architecture with attention-gating to establish a high-throughput and automated segmentation pipeline of pathological blood vessels in CT images acquired with or without the use of a contrast agent. Twenty-six patients with paired non-contrast and contrast-enhanced CT images within the ongoing Oxford Abdominal Aortic Aneurysm (OxAAA) study were randomly selected, manually annotated and used for model training and evaluation (13/13). Data augmentation methods were implemented to diversify the training data set in a ratio of 10:1. The performance of our Attention-based U-Net in extracting both the inner lumen and the outer wall of the aortic aneurysm from CT angiograms (CTA) was compared against a generic 3-D U-Net and displayed superior results. Subsequent implementation of this network architecture within the aortic segmentation pipeline from both contrast-enhanced CTA and non-contrast CT images has allowed for accurate and efficient extraction of the entire aortic volume. This extracted volume can be used to standardize current methods of aneurysmal disease management and sets the foundation for subsequent complex geometric and morphological analysis. Furthermore, the proposed pipeline can be extended to other vascular pathologies.

IVSep 16, 2019
Efficient 3D Fully Convolutional Networks for Pulmonary Lobe Segmentation in CT Images

Hoileong Lee, Tahreema Matin, Fergus Gleeson et al.

The human lung is a complex respiratory organ, consisting of five distinct anatomic compartments called lobes. Accurate and automatic segmentation of these pulmonary lobes from computed tomography (CT) images is of clinical importance for lung disease assessment and treatment planning. However, this task is challenging due to ambiguous lobar boundaries, anatomical variations and pathological deformations. In this paper, we propose a high-resolution and efficient 3D fully convolutional network to automatically segment the lobes. We refer to the network as Pulmonary Lobe Segmentation Network (PLS-Net), which is designed to efficiently exploit 3D spatial and contextual information from high-resolution volumetric CT images for effective volume-to-volume learning and inference. The PLS-Net is based on an asymmetric encoder-decoder architecture with three novel components: (i) 3D depthwise separable convolutions to improve the network efficiency by factorising each regular 3D convolution into two simpler operations; (ii) dilated residual dense blocks to efficiently expand the receptive field of the network and aggregate multi-scale contextual information for segmentation; and (iii) input reinforcement at each downsampled resolution to compensate for the loss of spatial information due to convolutional and downsampling operations. We evaluated the proposed PLS-Net on a multi-institutional dataset that consists of 210 CT images acquired from patients with a wide range of lung abnormalities. Experimental results show that our PLS-Net achieves state-of-the-art performance with better computational efficiency. Further experiments confirm the effectiveness of each novel component of the PLS-Net.
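The efficiency gain from component (i) can be made concrete by counting weights. The sketch below assumes the usual depthwise separable factorisation (a per-channel k x k x k depthwise convolution followed by a 1 x 1 x 1 pointwise convolution) and omits biases; the abstract only states that each regular 3D convolution is split into two simpler operations, so the exact layer layout in PLS-Net may differ, and the channel sizes are illustrative.

```python
def regular_conv3d_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard 3D convolution with a k x k x k kernel (no bias)."""
    return k ** 3 * c_in * c_out


def separable_conv3d_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise (k^3 per input channel) plus pointwise (1x1x1) pair."""
    depthwise = k ** 3 * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


regular = regular_conv3d_params(c_in=32, c_out=64, k=3)
separable = separable_conv3d_params(c_in=32, c_out=64, k=3)
reduction = regular / separable
```

For these illustrative sizes the separable pair needs about 19 times fewer weights than the regular convolution, and the saving grows with kernel size and output channel count, which matters for high-resolution volumetric inputs.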

CVMay 26, 2017
Extracting 3D Vascular Structures from Microscopy Images using Convolutional Recurrent Networks

Russell Bates, Benjamin Irving, Bostjan Markelc et al.

Vasculature is known to be of key biological significance, especially in the study of cancer. As such, considerable effort has been focused on the automated measurement and analysis of vasculature in medical and pre-clinical images. In tumors in particular, the vascular networks may be extremely irregular and the appearance of the individual vessels may not conform to classical descriptions of vascular appearance. Typically, vessels are extracted by either a segmentation and thinning pipeline, or by direct tracking. Neither of these methods is well suited to microscopy images of tumor vasculature. To address this, we propose a method to directly extract a medial representation of the vessels using Convolutional Neural Networks. We then show that these two-dimensional centerlines can be meaningfully extended into 3D in anisotropic and complex microscopy images using the recently popularized Convolutional Long Short-Term Memory units (ConvLSTM). We demonstrate the effectiveness of this hybrid convolutional-recurrent architecture over both 2D and 3D convolutional comparators.