Marcel Breeuwer

IV
h-index54
16papers
324citations
Novelty40%
AI Score40

16 Papers

IVSep 9, 2022Code
Pathology Synthesis of 3D-Consistent Cardiac MR Images using 2D VAEs and GANs

Sina Amirrajab, Cristian Lorenz, Juergen Weese et al.

We propose a method for synthesizing cardiac magnetic resonance (MR) images with plausible heart pathologies and realistic appearances for the purpose of generating labeled data for the application of supervised deep-learning (DL) training. The image synthesis consists of label deformation and label-to-image translation tasks. The former is achieved via latent space interpolation in a VAE model, while the latter is accomplished via a label-conditional GAN model. We devise three approaches for label manipulation in the latent space of the trained VAE model; i) \textbf{intra-subject synthesis} aiming to interpolate the intermediate slices of a subject to increase the through-plane resolution, ii) \textbf{inter-subject synthesis} aiming to interpolate the geometry and appearance of intermediate images between two dissimilar subjects acquired with different scanner vendors, and iii) \textbf{pathology synthesis} aiming to synthesize a series of pseudo-pathological synthetic subjects with characteristics of a desired heart disease. Furthermore, we propose to model the relationship between 2D slices in the latent space of the VAE prior to reconstruction for generating 3D-consistent subjects from stacking up 2D slice-by-slice generations. We demonstrate that such an approach could provide a solution to diversify and enrich an available database of cardiac MR images and to pave the way for the development of generalizable DL-based image analysis algorithms. We quantitatively evaluate the quality of the synthesized data in an augmentation scenario to achieve generalization and robustness to multi-vendor and multi-disease data for image segmentation. Our code is available at https://github.com/sinaamirrajab/CardiacPathologySynthesis.

IVAug 9, 2022
sim2real: Cardiac MR Image Simulation-to-Real Translation via Unsupervised GANs

Sina Amirrajab, Yasmina Al Khalil, Cristian Lorenz et al.

There has been considerable interest in the MR physics-based simulation of a database of virtual cardiac MR images for the development of deep-learning analysis networks. However, the employment of such a database is limited or shows suboptimal performance due to the realism gap, missing textures, and the simplified appearance of simulated images. In this work we 1) provide image simulation on virtual XCAT subjects with varying anatomies, and 2) propose sim2real translation network to improve image realism. Our usability experiments suggest that sim2real data exhibits a good potential to augment training data and boost the performance of a segmentation algorithm.

MED-PHJun 5, 2023
A Deep Learning Approach Utilizing Covariance Matrix Analysis for the ISBI Edited MRS Reconstruction Challenge

Julian P. Merkofer, Dennis M. J. van de Sande, Sina Amirrajab et al.

This work proposes a method to accelerate the acquisition of high-quality edited magnetic resonance spectroscopy (MRS) scans using machine learning models taking the sample covariance matrix as input. The method is invariant to the number of transients and robust to noisy input data for both synthetic as well as in-vivo scenarios.

CVJan 16, 2025Code
Scaling up self-supervised learning for improved surgical foundation models

Tim J. M. Jaspers, Ronald L. P. D. de Jong, Yiping Li et al.

Foundation models have revolutionized computer vision by achieving vastly superior performance across diverse tasks through large-scale pretraining on extensive datasets. However, their application in surgical computer vision has been limited. This study addresses this gap by introducing SurgeNetXL, a novel surgical foundation model that sets a new benchmark in surgical computer vision. Trained on the largest reported surgical dataset to date, comprising over 4.7 million video frames, SurgeNetXL achieves consistent top-tier performance across six datasets spanning four surgical procedures and three tasks, including semantic segmentation, phase recognition, and critical view of safety (CVS) classification. Compared with the best-performing surgical foundation models, SurgeNetXL shows mean improvements of 2.4, 9.0, and 12.6 percent for semantic segmentation, phase recognition, and CVS classification, respectively. Additionally, SurgeNetXL outperforms the best-performing ImageNet-based variants by 14.4, 4.0, and 1.6 percent in the respective tasks. In addition to advancing model performance, this study provides key insights into scaling pretraining datasets, extending training durations, and optimizing model architectures specifically for surgical computer vision. These findings pave the way for improved generalizability and robustness in data-scarce scenarios, offering a comprehensive framework for future research in this domain. All models and a subset of the SurgeNetXL dataset, including over 2 million video frames, are publicly available at: https://github.com/TimJaspers0801/SurgeNet.

CVJun 2, 2025
SemiVT-Surge: Semi-Supervised Video Transformer for Surgical Phase Recognition

Yiping Li, Ronald de Jong, Sahar Nasirihaghighi et al.

Accurate surgical phase recognition is crucial for computer-assisted interventions and surgical video analysis. Annotating long surgical videos is labor-intensive, driving research toward leveraging unlabeled data for strong performance with minimal annotations. Although self-supervised learning has gained popularity by enabling large-scale pretraining followed by fine-tuning on small labeled subsets, semi-supervised approaches remain largely underexplored in the surgical domain. In this work, we propose a video transformer-based model with a robust pseudo-labeling framework. Our method incorporates temporal consistency regularization for unlabeled data and contrastive learning with class prototypes, which leverages both labeled data and pseudo-labels to refine the feature space. Through extensive experiments on the private RAMIE (Robot-Assisted Minimally Invasive Esophagectomy) dataset and the public Cholec80 dataset, we demonstrate the effectiveness of our approach. By incorporating unlabeled data, we achieve state-of-the-art performance on RAMIE with a 4.9% accuracy increase and obtain comparable results to full supervision while using only 1/4 of the labeled data on Cholec80. Our findings establish a strong benchmark for semi-supervised surgical phase recognition, paving the way for future research in this domain.

CVDec 5, 2024
Benchmarking and Enhancing Surgical Phase Recognition Models for Robotic-Assisted Esophagectomy

Yiping Li, Romy van Jaarsveld, Ronald de Jong et al.

Robotic-assisted minimally invasive esophagectomy (RAMIE) is a recognized treatment for esophageal cancer, offering better patient outcomes compared to open surgery and traditional minimally invasive surgery. RAMIE is highly complex, spanning multiple anatomical areas and involving repetitive phases and non-sequential phase transitions. Our goal is to leverage deep learning for surgical phase recognition in RAMIE to provide intraoperative support to surgeons. To achieve this, we have developed a new surgical phase recognition dataset comprising 27 videos. Using this dataset, we conducted a comparative analysis of state-of-the-art surgical phase recognition models. To more effectively capture the temporal dynamics of this complex procedure, we developed a novel deep learning model featuring an encoder-decoder structure with causal hierarchical attention, which demonstrates superior performance compared to existing models.

CVDec 4, 2024
Benchmarking Pretrained Attention-based Models for Real-Time Recognition in Robot-Assisted Esophagectomy

Ronald L. P. D. de Jong, Yasmina al Khalil, Tim J. M. Jaspers et al.

Esophageal cancer is among the most common types of cancer worldwide. It is traditionally treated using open esophagectomy, but in recent years, robot-assisted minimally invasive esophagectomy (RAMIE) has emerged as a promising alternative. However, robot-assisted surgery can be challenging for novice surgeons, as they often suffer from a loss of spatial orientation. Computer-aided anatomy recognition holds promise for improving surgical navigation, but research in this area remains limited. In this study, we developed a comprehensive dataset for semantic segmentation in RAMIE, featuring the largest collection of vital anatomical structures and surgical instruments to date. Handling this diverse set of classes presents challenges, including class imbalance and the recognition of complex structures such as nerves. This study aims to understand the challenges and limitations of current state-of-the-art algorithms on this novel dataset and problem. Therefore, we benchmarked eight real-time deep learning models using two pretraining datasets. We assessed both traditional and attention-based networks, hypothesizing that attention-based networks better capture global patterns and address challenges such as occlusion caused by blood or other tissues. The benchmark includes our RAMIE dataset and the publicly available CholecSeg8k dataset, enabling a thorough assessment of surgical segmentation tasks. Our findings indicate that pretraining on ADE20k, a dataset for semantic segmentation, is more effective than pretraining on ImageNet. Furthermore, attention-based models outperform traditional convolutional neural networks, with SegNeXt and Mask2Former achieving higher Dice scores, and Mask2Former additionally excelling in average symmetric surface distance.

CVNov 18, 2025
SAM-Fed: SAM-Guided Federated Semi-Supervised Learning for Medical Image Segmentation

Sahar Nasirihaghighi, Negin Ghamsarian, Yiping Li et al.

Medical image segmentation is clinically important, yet data privacy and the cost of expert annotation limit the availability of labeled data. Federated semi-supervised learning (FSSL) offers a solution but faces two challenges: pseudo-label reliability depends on the strength of local models, and client devices often require compact or heterogeneous architectures due to limited computational resources. These constraints reduce the quality and stability of pseudo-labels, while large models, though more accurate, cannot be trained or used for routine inference on client devices. We propose SAM-Fed, a federated semi-supervised framework that leverages a high-capacity segmentation foundation model to guide lightweight clients during training. SAM-Fed combines dual knowledge distillation with an adaptive agreement mechanism to refine pixel-level supervision. Experiments on skin lesion and polyp segmentation across homogeneous and heterogeneous settings show that SAM-Fed consistently outperforms state-of-the-art FSSL methods.

LGJun 27, 2024
Generative AI for Synthetic Data Across Multiple Medical Modalities: A Systematic Review of Recent Developments and Challenges

Mahmoud Ibrahim, Yasmina Al Khalil, Sina Amirrajab et al.

This paper presents a comprehensive systematic review of generative models (GANs, VAEs, DMs, and LLMs) used to synthesize various medical data types, including imaging (dermoscopic, mammographic, ultrasound, CT, MRI, and X-ray), text, time-series, and tabular data (EHR). Unlike previous narrowly focused reviews, our study encompasses a broad array of medical data modalities and explores various generative models. Our search strategy queries databases such as Scopus, PubMed, and ArXiv, focusing on recent works from January 2021 to November 2023, excluding reviews and perspectives. This period emphasizes recent advancements beyond GANs, which have been extensively covered previously. The survey reveals insights from three key aspects: (1) Synthesis applications and purpose of synthesis, (2) generation techniques, and (3) evaluation methods. It highlights clinically valid synthesis applications, demonstrating the potential of synthetic data to tackle diverse clinical requirements. While conditional models incorporating class labels, segmentation masks and image translations are prevalent, there is a gap in utilizing prior clinical knowledge and patient-specific context, suggesting a need for more personalized synthesis approaches and emphasizing the importance of tailoring generative approaches to the unique characteristics of medical data. Additionally, there is a significant gap in using synthetic data beyond augmentation, such as for validation and evaluation of downstream medical AI models. The survey uncovers that the lack of standardized evaluation methodologies tailored to medical images is a barrier to clinical application, underscoring the need for in-depth evaluation approaches, benchmarking, and comparative studies to promote openness and collaboration.

IVSep 27, 2021
Optimized Automated Cardiac MR Scar Quantification with GAN-Based Data Augmentation

Didier R. P. R. M. Lustermans, Sina Amirrajab, Mitko Veta et al.

Background: The clinical utility of late gadolinium enhancement (LGE) cardiac MRI is limited by the lack of standardization, and time-consuming postprocessing. In this work, we tested the hypothesis that a cascaded deep learning pipeline trained with augmentation by synthetically generated data would improve model accuracy and robustness for automated scar quantification. Methods: A cascaded pipeline consisting of three consecutive neural networks is proposed, starting with a bounding box regression network to identify a region of interest around the left ventricular (LV) myocardium. Two further nnU-Net models are then used to segment the myocardium and, if present, scar. The models were trained on the data from the EMIDEC challenge, supplemented with an extensive synthetic dataset generated with a conditional GAN. Results: The cascaded pipeline significantly outperformed a single nnU-Net directly segmenting both the myocardium (mean Dice similarity coefficient (DSC) (standard deviation (SD)): 0.84 (0.09) vs 0.63 (0.20), p < 0.01) and scar (DSC: 0.72 (0.34) vs 0.46 (0.39), p < 0.01) on a per-slice level. The inclusion of the synthetic data as data augmentation during training improved the scar segmentation DSC by 0.06 (p < 0.01). The mean DSC per-subject on the challenge test set, for the cascaded pipeline augmented by synthetic generated data, was 0.86 (0.03) and 0.67 (0.29) for myocardium and scar, respectively. Conclusion: A cascaded deep learning-based pipeline trained with augmentation by synthetically generated data leads to myocardium and scar segmentations that are similar to the manual operator, and outperforms direct segmentation without the synthetic images.

IVNov 25, 2020
Physics-informed neural networks for myocardial perfusion MRI quantification

Rudolf L. M. van Herten, Amedeo Chiribiri, Marcel Breeuwer et al.

Tracer-kinetic models allow for the quantification of kinetic parameters such as blood flow from dynamic contrast-enhanced magnetic resonance (MR) images. Fitting the observed data with multi-compartment exchange models is desirable, as they are physiologically plausible and resolve directly for blood flow and microvascular function. However, the reliability of model fitting is limited by the low signal-to-noise ratio, temporal resolution, and acquisition length. This may result in inaccurate parameter estimates. This study introduces physics-informed neural networks (PINNs) as a means to perform myocardial perfusion MR quantification, which provides a versatile scheme for the inference of kinetic parameters. These neural networks can be trained to fit the observed perfusion MR data while respecting the underlying physical conservation laws described by a multi-compartment exchange model. Here, we provide a framework for the implementation of PINNs in myocardial perfusion MR. The approach is validated both in silico and in vivo. In the in silico study, an overall reduction in mean-squared error with the ground-truth parameters was observed compared to a standard non-linear least squares fitting approach. The in vivo study demonstrates that the method produces parameter values comparable to those previously found in literature, as well as providing parameter maps which match the clinical diagnosis of patients.

IVJul 27, 2020
XCAT-GAN for Synthesizing 3D Consistent Labeled Cardiac MR Images on Anatomically Variable XCAT Phantoms

Sina Amirrajab, Samaneh Abbasi-Sureshjani, Yasmina Al Khalil et al.

Generative adversarial networks (GANs) have provided promising data enrichment solutions by synthesizing high-fidelity images. However, generating large sets of labeled images with new anatomical variations remains unexplored. We propose a novel method for synthesizing cardiac magnetic resonance (CMR) images on a population of virtual subjects with a large anatomical variation, introduced using the 4D eXtended Cardiac and Torso (XCAT) computerized human phantom. We investigate two conditional image synthesis approaches grounded on a semantically-consistent mask-guided image generation technique: 4-class and 8-class XCAT-GANs. The 4-class technique relies on only the annotations of the heart; while the 8-class technique employs a predicted multi-tissue label map of the heart-surrounding organs and provides better guidance for our conditional image synthesis. For both techniques, we train our conditional XCAT-GAN with real images paired with corresponding labels and subsequently at the inference time, we substitute the labels with the XCAT derived ones. Therefore, the trained network accurately transfers the tissue-specific textures to the new label maps. By creating 33 virtual subjects of synthetic CMR images at the end-diastolic and end-systolic phases, we evaluate the usefulness of such data in the downstream cardiac cavity segmentation task under different augmentation strategies. Results demonstrate that even with only 20% of real images (40 volumes) seen during training, segmentation performance is retained with the addition of synthetic CMR images. Moreover, the improvement in utilizing synthetic images for augmenting the real data is evident through the reduction of Hausdorff distance up to 28% and an increase in the Dice score up to 5%, indicating a higher similarity to the ground truth in all dimensions.

IVFeb 17, 2020
4D Semantic Cardiac Magnetic Resonance Image Synthesis on XCAT Anatomical Model

Samaneh Abbasi-Sureshjani, Sina Amirrajab, Cristian Lorenz et al.

We propose a hybrid controllable image generation method to synthesize anatomically meaningful 3D+t labeled Cardiac Magnetic Resonance (CMR) images. Our hybrid method takes the mechanistic 4D eXtended CArdiac Torso (XCAT) heart model as the anatomical ground truth and synthesizes CMR images via a data-driven Generative Adversarial Network (GAN). We employ the state-of-the-art SPatially Adaptive De-normalization (SPADE) technique for conditional image synthesis to preserve the semantic spatial information of ground truth anatomy. Using the parameterized motion model of the XCAT heart, we generate labels for 25 time frames of the heart for one cardiac cycle at 18 locations for the short axis view. Subsequently, realistic images are generated from these labels, with modality-specific features that are learned from real CMR image data. We demonstrate that style transfer from another cardiac image can be accomplished by using a style encoder network. Due to the flexibility of XCAT in creating new heart models, this approach can result in a realistic virtual population to address different challenges the medical image analysis research community is facing such as expensive data collection. Our proposed method has a great potential to synthesize 4D controllable CMR images with annotations and adaptable styles to be used in various supervised multi-site, multi-vendor applications in medical image analysis.

IVJul 27, 2019
Deep learning-based prediction of kinetic parameters from myocardial perfusion MRI

Cian M. Scannell, Piet van den Bosch, Amedeo Chiribiri et al.

The quantification of myocardial perfusion MRI has the potential to provide a fast, automated and user-independent assessment of myocardial ischaemia. However, due to the relatively high noise level and low temporal resolution of the acquired data and the complexity of the tracer-kinetic models, the model fitting can yield unreliable parameter estimates. A solution to this problem is the use of Bayesian inference which can incorporate prior knowledge and improve the reliability of the parameter estimation. This, however, uses Markov chain Monte Carlo sampling to approximate the posterior distribution of the kinetic parameters which is extremely time intensive. This work proposes training convolutional networks to directly predict the kinetic parameters from the signal-intensity curves that are trained using estimates obtained from the Bayesian inference. This allows fast estimation of the kinetic parameters with a similar performance to the Bayesian inference.

IVJun 6, 2019
Hierarchical Bayesian myocardial perfusion quantification

Cian M. Scannell, Amedeo Chiribiri, Adriana D. M. Villa et al.

Purpose: Tracer-kinetic models can be used for the quantitative assessment of contrast-enhanced MRI data. However, the model-fitting can produce unreliable results due to the limited data acquired and the high noise levels. Such problems are especially prevalent in myocardial perfusion MRI leading to the compromise of constrained numerical deconvolutions and segmental signal averaging being commonly used as alternatives to the more complex tracer-kinetic models. Methods: In this work, the use of hierarchical Bayesian inference for the parameter estimation is explored. It is shown that with Bayesian inference it is possible to reliably fit the two-compartment exchange model to perfusion data. The use of prior knowledge on the ranges of kinetic parameters and the fact that neighbouring voxels are likely to have similar kinetic properties combined with a Markov chain Monte Carlo based fitting procedure significantly improves the reliability of the perfusion estimates with compared to the traditional least-squares approach. The method is assessed using both simulated and patient data. Results: The average (standard deviation) normalised mean square error for the distinct noise realisations of a simulation phantom falls from 0.32 (0.55) with the least-squares fitting to 0.13 (0.2) using Bayesian inference. The assessment of the presence of coronary artery disease based purely on the quantitative MBF maps obtained using Bayesian inference matches the visual assessment in all 24 slices. When using the maps obtained by the least-squares fitting, a corresponding assessment is only achieved in 16/24 slices. Conclusion: Bayesian inference allows a reliable, fully automated and user-independent assessment of myocardial perfusion on a voxel-wise level using the two-compartment exchange model.

NAMay 28, 2015
Invertible Orientation Scores of 3D Images

Michiel Janssen, Remco Duits, Marcel Breeuwer

The enhancement and detection of elongated structures in noisy image data is relevant for many biomedical applications. To handle complex crossing structures in 2D images, 2D orientation scores were introduced, which already showed their use in a variety of applications. Here we extend this work to 3D orientation scores. First, we construct the orientation score from a given dataset, which is achieved by an invertible coherent state type of transform. For this transformation we introduce 3D versions of the 2D cake-wavelets, which are complex wavelets that can simultaneously detect oriented structures and oriented edges. For efficient implementation of the different steps in the wavelet creation we use a spherical harmonic transform. Finally, we show some first results of practical applications of 3D orientation scores.