Meritxell Bach Cuadra

IV
h-index: 69
Papers: 42
Citations: 551
Novelty: 40%
AI Score: 53

42 Papers

IV · Nov 9, 2022 · Code
Novel structural-scale uncertainty measures and error retention curves: application to multiple sclerosis

Nataliia Molchanova, Vatsal Raina, Andrey Malinin et al.

This paper focuses on the uncertainty estimation for white matter lesions (WML) segmentation in magnetic resonance imaging (MRI). On one side, voxel-scale segmentation errors cause the erroneous delineation of the lesions; on the other side, lesion-scale detection errors lead to wrong lesion counts. Both of these factors are clinically relevant for the assessment of multiple sclerosis patients. This work aims to compare the ability of different voxel- and lesion-scale uncertainty measures to capture errors related to segmentation and lesion detection, respectively. Our main contributions are (i) proposing new measures of lesion-scale uncertainty that do not utilise voxel-scale uncertainties; (ii) extending an error retention curves analysis framework for evaluation of lesion-scale uncertainty measures. Our results obtained on the multi-center testing set of 58 patients demonstrate that the proposed lesion-scale measure achieves the best performance among the analysed measures. All code implementations are provided at https://github.com/NataliiaMolch/MS_WML_uncs
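
As an illustration of the error retention idea mentioned in this abstract, here is a minimal sketch, assuming a generic per-item error (1 for wrong, 0 for correct) rather than the segmentation metrics used in the paper. Items are ranked by uncertainty, the most uncertain ones are progressively replaced by the ground truth, and the remaining error is tracked. The function names and the toy data are hypothetical and are not taken from the linked repository.

```python
import numpy as np

def error_retention_curve(errors, uncertainty):
    """Return (retention fractions, mean error) for an error retention curve.

    errors: per-item error (e.g. 1 for a wrong item, 0 for a correct one).
    uncertainty: per-item uncertainty score; higher means less trusted.
    """
    errors = np.asarray(errors, dtype=float)
    uncertainty = np.asarray(uncertainty, dtype=float)
    n = errors.size
    # Order items from most certain to most uncertain.
    order = np.argsort(uncertainty, kind="stable")
    sorted_errors = errors[order]
    # At retention k/n the k most certain items keep the model prediction;
    # the rest are replaced by the ground truth and contribute zero error.
    cumulative = np.concatenate(([0.0], np.cumsum(sorted_errors)))
    retention = np.arange(n + 1) / n
    return retention, cumulative / n

def area_under_curve(x, y):
    # Trapezoidal rule written out explicitly.
    return float(np.sum((x[1:] - x[:-1]) * (y[1:] + y[:-1]) / 2.0))

# Toy data (hypothetical): two of six items are wrong.
errors = [0, 0, 1, 0, 1, 0]
good_unc = [0.1, 0.2, 0.9, 0.3, 0.8, 0.1]  # high uncertainty on the errors
bad_unc = [0.9, 0.8, 0.1, 0.7, 0.2, 0.6]   # high uncertainty on correct items

r, e_good = error_retention_curve(errors, good_unc)
_, e_bad = error_retention_curve(errors, bad_unc)
auc_good = area_under_curve(r, e_good)
auc_bad = area_under_curve(r, e_bad)
print(auc_good, auc_bad)
```

In this sketch a lower area under the curve indicates an uncertainty measure that ranks the erroneous items as more uncertain; both curves reach the same error at full retention.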

IV · Nov 8, 2023 · Code
FetMRQC: a robust quality control system for multi-centric fetal brain MRI

Thomas Sanchez, Oscar Esteban, Yvan Gomez et al.

Fetal brain MRI is becoming an increasingly relevant complement to neurosonography for perinatal diagnosis, allowing fundamental insights into fetal brain development throughout gestation. However, uncontrolled fetal motion and heterogeneity in acquisition protocols lead to data of variable quality, potentially biasing the outcome of subsequent studies. We present FetMRQC, an open-source machine-learning framework for automated image quality assessment and quality control that is robust to domain shifts induced by the heterogeneity of clinical data. FetMRQC extracts an ensemble of quality metrics from unprocessed anatomical MRI and combines them to predict experts' ratings using random forests. We validate our framework on a pioneeringly large and diverse dataset of more than 1600 manually rated fetal brain T2-weighted images from four clinical centers and 13 different scanners. Our study shows that FetMRQC's predictions generalize well to unseen data while being interpretable. FetMRQC is a step towards more robust fetal brain neuroimaging, which has the potential to shed new insights on the developing human brain.

IV · Apr 12, 2023 · Code
FetMRQC: Automated Quality Control for fetal brain MRI

Thomas Sanchez, Oscar Esteban, Yvan Gomez et al.

Quality control (QC) has long been considered essential to guarantee the reliability of neuroimaging studies. It is particularly important for fetal brain MRI, where large and unpredictable fetal motion can lead to substantial artifacts in the acquired images. Existing methods for fetal brain quality assessment operate at the slice level and fail to give a comprehensive picture of image quality, which can only be obtained by looking at the entire brain volume. In this work, we propose FetMRQC, a machine learning framework for automated image quality assessment tailored to fetal brain MRI, which extracts an ensemble of quality metrics that are then used to predict experts' ratings. Based on the manual ratings of more than 1000 low-resolution stacks acquired across two different institutions, we show that, compared with existing quality metrics, FetMRQC is able to generalize out-of-domain, while being interpretable and data efficient. We also release a novel manual quality rating tool designed to facilitate and optimize quality rating of fetal brain images. Our tool, along with all the code to generate, train and evaluate the model, is available at https://github.com/Medical-Image-Analysis-Laboratory/fetal_brain_qc/ .

CV · Nov 15, 2023 · Code
Structural-Based Uncertainty in Deep Learning Across Anatomical Scales: Analysis in White Matter Lesion Segmentation

Nataliia Molchanova, Vatsal Raina, Andrey Malinin et al.

This paper explores uncertainty quantification (UQ) as an indicator of the trustworthiness of automated deep-learning (DL) tools in the context of white matter lesion (WML) segmentation from magnetic resonance imaging (MRI) scans of multiple sclerosis (MS) patients. Our study focuses on two principal aspects of uncertainty in structured output segmentation tasks. First, we postulate that a reliable uncertainty measure should indicate predictions likely to be incorrect with high uncertainty values. Second, we investigate the merit of quantifying uncertainty at different anatomical scales (voxel, lesion, or patient). We hypothesize that uncertainty at each scale is related to specific types of errors. Our study aims to confirm this relationship by conducting separate analyses for in-domain and out-of-domain settings. Our primary methodological contributions are (i) the development of novel measures for quantifying uncertainty at lesion and patient scales, derived from structural prediction discrepancies, and (ii) the extension of an error retention curve analysis framework to facilitate the evaluation of UQ performance at both lesion and patient scales. The results from a multi-centric MRI dataset of 444 patients demonstrate that our proposed measures more effectively capture model errors at the lesion and patient scales compared to measures that average voxel-scale uncertainty values. We provide the UQ protocols code at https://github.com/Medical-Image-Analysis-Laboratory/MS_WML_uncs.
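
To make the notion of a structure-based, patient-scale uncertainty more concrete, here is a simplified sketch. It assumes one plausible reading of "structural prediction discrepancies": the disagreement, measured with the Dice coefficient, between an ensemble segmentation and the segmentations of its individual members. The function names, the 0.5 threshold and the toy probabilities are assumptions for illustration; the exact measures are defined in the paper and the linked repository.

```python
import numpy as np

def dice(a, b):
    """Dice overlap between two binary masks (1.0 when both are empty)."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    total = a.sum() + b.sum()
    return 1.0 if total == 0 else float(2.0 * np.sum(a & b) / total)

def patient_disagreement(member_probs, threshold=0.5):
    """1 - mean Dice between the ensemble mask and each member mask.

    member_probs: array of shape (M, ...) with per-member lesion probabilities.
    threshold: binarisation threshold (ASSUMED to be 0.5 here).
    """
    member_probs = np.asarray(member_probs, dtype=float)
    ensemble_mask = member_probs.mean(axis=0) >= threshold
    member_masks = member_probs >= threshold
    agreement = np.mean([dice(ensemble_mask, m) for m in member_masks])
    return 1.0 - float(agreement)

# Toy ensembles of three members over four voxels (hypothetical values).
agreeing = [[0.9, 0.1, 0.8, 0.2]] * 3
disagreeing = [
    [0.9, 0.9, 0.1, 0.1],
    [0.9, 0.1, 0.1, 0.1],
    [0.9, 0.9, 0.9, 0.1],
]
u_agree = patient_disagreement(agreeing)
u_disagree = patient_disagreement(disagreeing)
print(u_agree, u_disagree)
```

Unlike an average of voxel-scale uncertainties, this score depends only on how much the predicted structures differ between ensemble members.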

IV · Feb 10, 2023 · Code
Tackling Bias in the Dice Similarity Coefficient: Introducing nDSC for White Matter Lesion Segmentation

Vatsal Raina, Nataliia Molchanova, Mara Graziani et al.

The development of automatic segmentation techniques for medical imaging tasks requires assessment metrics to fairly judge and rank such approaches on benchmarks. The Dice Similarity Coefficient (DSC) is a popular choice for comparing the agreement between the predicted segmentation against a ground-truth mask. However, the DSC metric has been shown to be biased to the occurrence rate of the positive class in the ground-truth, and hence should be considered in combination with other metrics. This work describes a detailed analysis of the recently proposed normalised Dice Similarity Coefficient (nDSC) for binary segmentation tasks as an adaptation of DSC which scales the precision at a fixed recall rate to tackle this bias. White matter lesion segmentation on magnetic resonance images of multiple sclerosis patients is selected as a case study task to empirically assess the suitability of nDSC. We validate the normalised DSC using two different models across 59 subject scans with a wide range of lesion loads. It is found that the nDSC is less biased than DSC with lesion load on standard white matter lesion segmentation benchmarks measured using standard rank correlation coefficients. An implementation of nDSC is made available at: https://github.com/NataliiaMolch/nDSC .
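
The following sketch contrasts the standard DSC with a normalised variant in which the false-positive count is rescaled before the score is computed, which changes the precision while leaving the recall fixed. The form of the scaling factor `kappa` and the reference positive fraction `r` are assumptions made for illustration; the reference definition is the one in the paper and the linked implementation.

```python
import numpy as np

def _counts(gt, pred):
    gt = np.asarray(gt, dtype=bool)
    pred = np.asarray(pred, dtype=bool)
    tp = int(np.sum(gt & pred))
    fp = int(np.sum(~gt & pred))
    fn = int(np.sum(gt & ~pred))
    return gt, tp, fp, fn

def dsc(gt, pred):
    """Standard Dice Similarity Coefficient: 2TP / (2TP + FP + FN)."""
    _, tp, fp, fn = _counts(gt, pred)
    denom = 2 * tp + fp + fn
    return 1.0 if denom == 0 else 2 * tp / denom

def ndsc(gt, pred, r=0.001):
    """Normalised DSC sketch: 2TP / (2TP + kappa * FP + FN).

    ASSUMPTION: kappa rescales the false positives as if the positive class
    occupied a fraction r of the image, kappa = (1 - r) * P / (r * N).
    """
    gt, tp, fp, fn = _counts(gt, pred)
    if tp + fp + fn == 0:
        return 1.0
    n_pos = int(gt.sum())
    n_neg = gt.size - n_pos
    kappa = 1.0 if n_pos == 0 or n_neg == 0 else (1 - r) * n_pos / (r * n_neg)
    return 2 * tp / (2 * tp + kappa * fp + fn)

# Toy example (hypothetical): 2 positives out of 10 voxels.
gt = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
pred = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
plain = dsc(gt, pred)
same_rate = ndsc(gt, pred, r=0.2)  # r equals the true positive fraction
print(plain, same_rate, ndsc(gt, pred))
```

Under this assumed form, the normalised score coincides with the plain DSC when `r` equals the actual positive fraction of the scan, and otherwise weights false positives according to the lesion load.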

IV · Apr 20, 2022
Fetal Brain Tissue Annotation and Segmentation Challenge Results

Kelly Payette, Hongwei Li, Priscille de Dumast et al.

In-utero fetal MRI is emerging as an important tool in the diagnosis and analysis of the developing human brain. Automatic segmentation of the developing fetal brain is a vital step in the quantitative analysis of prenatal neurodevelopment both in the research and clinical context. However, manual segmentation of cerebral structures is time-consuming and prone to error and inter-observer variability. Therefore, we organized the Fetal Tissue Annotation (FeTA) Challenge in 2021 in order to encourage the development of automatic segmentation algorithms on an international level. The challenge utilized the FeTA Dataset, an open dataset of fetal brain MRI reconstructions segmented into seven different tissues (external cerebrospinal fluid, grey matter, white matter, ventricles, cerebellum, brainstem, deep grey matter). Twenty international teams participated in this challenge, submitting a total of 21 algorithms for evaluation. In this paper, we provide a detailed analysis of the results from both a technical and clinical perspective. All participants relied on deep learning methods, mainly U-Nets, with some variability present in the network architecture, optimization, and image pre- and post-processing. The majority of teams used existing medical imaging deep learning frameworks. The main differences between the submissions were the fine-tuning done during training, and the specific pre- and post-processing steps performed. The challenge results showed that almost all submissions performed similarly. Four of the top five teams used ensemble learning methods. However, one team's algorithm, which consisted of an asymmetrical U-Net architecture, performed significantly better than the other submissions. This paper provides a first-of-its-kind benchmark for future automatic multi-tissue segmentation algorithms for the developing human brain in utero.

IV · Jul 8, 2024 · Code
Interpretability of Uncertainty: Exploring Cortical Lesion Segmentation in Multiple Sclerosis

Nataliia Molchanova, Alessandro Cagol, Pedro M. Gordaliza et al.

Uncertainty quantification (UQ) has become critical for evaluating the reliability of artificial intelligence systems, especially in medical image segmentation. This study addresses the interpretability of instance-wise uncertainty values in deep learning models for focal lesion segmentation in magnetic resonance imaging, specifically cortical lesion (CL) segmentation in multiple sclerosis. CL segmentation presents several challenges, including the complexity of manual segmentation, high variability in annotation, data scarcity, and class imbalance, all of which contribute to aleatoric and epistemic uncertainty. We explore how UQ can be used not only to assess prediction reliability but also to provide insights into model behavior, detect biases, and verify the accuracy of UQ methods. Our research demonstrates the potential of instance-wise uncertainty values to offer post hoc global model explanations, serving as a sanity check for the model. The implementation is available at https://github.com/NataliiaMolch/interpret-lesion-unc.

LG · Jun 30, 2022
Shifts 2.0: Extending The Dataset of Real Distributional Shifts

Andrey Malinin, Andreas Athanasopoulos, Muhamed Barakovic et al.

Distributional shift, or the mismatch between training and deployment data, is a significant obstacle to the usage of machine learning in high-stakes industrial applications, such as autonomous driving and medicine. This creates a need to be able to assess how robustly ML models generalize as well as the quality of their uncertainty estimates. Standard ML baseline datasets do not allow these properties to be assessed, as the training, validation and test data are often identically distributed. Recently, a range of dedicated benchmarks have appeared, featuring both distributionally matched and shifted data. Among these benchmarks, the Shifts dataset stands out in terms of the diversity of tasks as well as the data modalities it features. While most of the benchmarks are heavily dominated by 2D image classification tasks, Shifts contains tabular weather forecasting, machine translation, and vehicle motion prediction tasks. This enables the robustness properties of models to be assessed on a diverse set of industrial-scale tasks and either universal or directly applicable task-specific conclusions to be reached. In this paper, we extend the Shifts Dataset with two datasets sourced from industrial, high-risk applications of high societal importance. Specifically, we consider the tasks of segmentation of white matter Multiple Sclerosis lesions in 3D magnetic resonance brain images and the estimation of power consumption in marine cargo vessels. Both tasks feature ubiquitous distributional shifts and a strict safety requirement due to the high cost of errors. These new datasets will allow researchers to further explore robust generalization and uncertainty estimation in new situations. In this work, we provide a description of the dataset and baseline results for both tasks.

CV · Oct 18, 2022
Weakly Supervised Learning with Automated Labels from Radiology Reports for Glioma Change Detection

Tommaso Di Noto, Meritxell Bach Cuadra, Chirine Atat et al.

Gliomas are the most frequent primary brain tumors in adults. Glioma change detection aims at finding the relevant parts of the image that change over time. Although Deep Learning (DL) shows promising performances in similar change detection tasks, the creation of large annotated datasets represents a major bottleneck for supervised DL applications in radiology. To overcome this, we propose a combined use of weak labels (imprecise, but fast-to-create annotations) and Transfer Learning (TL). Specifically, we explore inductive TL, where source and target domains are identical, but tasks are different due to a label shift: our target labels are created manually by three radiologists, whereas our source weak labels are generated automatically from radiology reports via NLP. We frame knowledge transfer as hyperparameter optimization, thus avoiding heuristic choices that are frequent in related works. We investigate the relationship between model size and TL, comparing a low-capacity VGG with a higher-capacity ResNeXt model. We evaluate our models on 1693 T2-weighted magnetic resonance imaging difference maps created from 183 patients, by classifying them into stable or unstable according to tumor evolution. The weak labels extracted from radiology reports allowed us to increase dataset size more than 3-fold, and improve VGG classification results from 75% to 82% AUC. Mixed training from scratch led to higher performance than fine-tuning or feature extraction. To assess generalizability, we ran inference on an open dataset (BraTS-2015: 15 patients, 51 difference maps), reaching up to 76% AUC. Overall, results suggest that medical imaging problems may benefit from smaller models and different TL strategies with respect to computer vision datasets, and that report-generated weak labels are effective in improving model performances. Code, in-house dataset and BraTS labels are released.

IV · Nov 11, 2022
Self-Supervised Isotropic Superresolution Fetal Brain MRI

Kay Lächler, Hélène Lajous, Michael Unser et al.

Superresolution T2-weighted fetal-brain magnetic-resonance imaging (FBMRI) traditionally relies on the availability of several orthogonal low-resolution series of 2-dimensional thick slices (volumes). In practice, only a few low-resolution volumes are acquired. Thus, optimization-based image-reconstruction methods require strong regularization using hand-crafted regularizers (e.g., TV). Yet, due to in utero fetal motion and the rapidly changing fetal brain anatomy, the acquisition of the high-resolution images that are required to train supervised learning methods is difficult. In this paper, we sidestep this difficulty by providing a proof of concept of a self-supervised single-volume superresolution framework for T2-weighted FBMRI (SAIR). We validate SAIR quantitatively in a motion-free simulated environment. Our results for different noise levels and resolution ratios suggest that SAIR is comparable to multiple-volume superresolution reconstruction methods. We also evaluate SAIR qualitatively on clinical FBMRI data. The results suggest SAIR could be incorporated into current reconstruction pipelines.

IV · Jul 5, 2023
Direct segmentation of brain white matter tracts in diffusion MRI

Hamza Kebiri, Ali Gholipour, Meritxell Bach Cuadra et al.

The brain white matter consists of a set of tracts that connect distinct regions of the brain. Segmentation of these tracts is often needed for clinical and research studies. Diffusion-weighted MRI offers unique contrast to delineate these tracts. However, existing segmentation methods rely on intermediate computations such as tractography or estimation of fiber orientation density. These intermediate steps are themselves complex and can introduce unnecessary errors. Moreover, they often require dense multi-shell measurements that are unavailable in many clinical and research applications. As a result, current methods suffer from low accuracy and poor generalizability. Here, we propose a new deep learning method that segments these tracts directly from the diffusion MRI data, thereby sidestepping the intermediate computation errors. Our experiments show that this method can achieve segmentation accuracy that is on par with state-of-the-art methods (mean Dice Similarity Coefficient of 0.826). Compared with the state of the art, our method offers far superior generalizability to undersampled data that are typical of clinical studies and to data obtained with different acquisition protocols. Moreover, we propose a new method for detecting inaccurate segmentations and show that it is more accurate than standard methods that are based on estimation uncertainty quantification. The new methods can serve many critically important clinical and scientific applications that require accurate and reliable non-invasive segmentation of white matter tracts.

CV · Apr 13
Towards Brain MRI Foundation Models for the Clinic: Findings from the FOMO25 Challenge

Asbjørn Munk, Stefano Cerri, Vardan Nersesjan et al.

Clinical deployment of automated brain MRI analysis faces a fundamental challenge: clinical data is heterogeneous and noisy, and high-quality labels are prohibitively costly to obtain. Self-supervised learning (SSL) can address this by leveraging the vast amounts of unlabeled data produced in clinical workflows to train robust foundation models that adapt out-of-domain with minimal supervision. However, the development of foundation models for brain MRI has been limited by small pretraining datasets and in-domain benchmarking focused on high-quality, research-grade data. To address this gap, we organized the FOMO25 challenge as a satellite event at MICCAI 2025. FOMO25 provided participants with a large pretraining dataset, FOMO60K, and evaluated models on data sourced directly from clinical workflows in few-shot and out-of-domain settings. Tasks covered infarct classification, meningioma segmentation, and brain age regression, and considered both models trained on FOMO60K (method track) and any data (open track). Nineteen foundation models from sixteen teams were evaluated using a standardized containerized pipeline. Results show that (a) self-supervised pretraining improves generalization on clinical data under domain shift, with the strongest models trained out-of-domain surpassing supervised baselines trained in-domain; (b) no single pretraining objective benefits all tasks: MAE favors segmentation, while hybrid reconstruction-contrastive objectives favor classification; and (c) strong performance was achieved by small pretrained models, and scaling model size and training duration did not yield reliable improvements.

IV · Aug 21, 2024
Exploiting XAI maps to improve MS lesion segmentation and detection in MRI

Federico Spagnolo, Nataliia Molchanova, Mario Ocampo Pineda et al.

To date, several methods have been developed to explain deep learning algorithms for classification tasks. Recently, an adaptation of two of such methods has been proposed to generate instance-level explainable maps in a semantic segmentation scenario, such as multiple sclerosis (MS) lesion segmentation. In the mentioned work, a 3D U-Net was trained and tested for MS lesion segmentation, yielding an F1 score of 0.7006, and a positive predictive value (PPV) of 0.6265. The distribution of values in explainable maps exposed some differences between maps of true and false positive (TP/FP) examples. Inspired by those results, we explore in this paper the use of characteristics of lesion-specific saliency maps to refine segmentation and detection scores. We generate around 21000 maps from as many TP/FP lesions in a batch of 72 patients (training set) and 4868 from the 37 patients in the test set. 93 radiomic features extracted from the first set of maps were used to train a logistic regression model and classify TP versus FP. On the test set, F1 score and PPV were improved by a large margin when compared to the initial model, reaching 0.7450 and 0.7817, with 95% confidence intervals of [0.7358, 0.7547] and [0.7679, 0.7962], respectively. These results suggest that saliency maps can be used to refine prediction scores, boosting a model's performances.
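
The effect of filtering candidate lesions on detection scores can be illustrated with a short sketch. It computes the positive predictive value (PPV) and F1 score from lesion-level counts before and after a classifier discards candidates it labels as false positives. All counts below are hypothetical and chosen only to show the arithmetic; they are not results from the paper.

```python
def detection_scores(tp, fp, fn):
    """Return (PPV, F1) from lesion-level detection counts."""
    ppv = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) > 0 else 0.0
    return ppv, f1

def apply_filter(tp, fp, fn, removed_fp, removed_tp):
    """Update counts after a TP-versus-FP classifier discards candidates.

    removed_fp: false positives correctly discarded.
    removed_tp: true positives wrongly discarded (they become misses).
    """
    return tp - removed_tp, fp - removed_fp, fn + removed_tp

# Hypothetical counts for an initial segmentation model.
tp, fp, fn = 70, 40, 30
ppv_before, f1_before = detection_scores(tp, fp, fn)

# Hypothetical filter outcome: 25 FPs removed at the cost of 5 TPs.
tp2, fp2, fn2 = apply_filter(tp, fp, fn, removed_fp=25, removed_tp=5)
ppv_after, f1_after = detection_scores(tp2, fp2, fn2)

print(round(ppv_before, 4), round(f1_before, 4))
print(round(ppv_after, 4), round(f1_after, 4))
```

The filter helps only as long as the false positives it removes outweigh the true positives it discards, which is why both PPV and F1 are reported.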

IV · Sep 2, 2024
Ground-truth effects in learning-based fiber orientation distribution estimation in neonatal brains

Rizhong Lin, Hamza Kebiri, Ali Gholipour et al.

Diffusion Magnetic Resonance Imaging (dMRI) is a non-invasive method for depicting brain microstructure in vivo. Fiber orientation distributions (FODs) are mathematical representations extensively used to map white matter fiber configurations. Recently, FOD estimation with deep neural networks has seen growing success, in particular, those of neonates estimated with fewer diffusion measurements. These methods are mostly trained on target FODs reconstructed with multi-shell multi-tissue constrained spherical deconvolution (MSMT-CSD), which might not be the ideal ground truth for developing brains. Here, we investigate this hypothesis by training a state-of-the-art model based on the U-Net architecture on both MSMT-CSD and single-shell three-tissue constrained spherical deconvolution (SS3T-CSD). Our results suggest that SS3T-CSD might be more suited for neonatal brains, given that the ratio between single and multiple fiber-estimated voxels with SS3T-CSD is more realistic compared to MSMT-CSD. Additionally, increasing the number of input gradient directions significantly improves performance with SS3T-CSD over MSMT-CSD. Finally, in an age domain-shift setting, SS3T-CSD maintains robust performance across age groups, indicating its potential for more accurate neonatal brain imaging.

IV · Nov 25, 2022
Domain generalization in fetal brain MRI segmentation with multi-reconstruction augmentation

Priscille de Dumast, Meritxell Bach Cuadra

Quantitative analysis of in utero human brain development is crucial for characterizing abnormal development. Magnetic resonance image (MRI) segmentation is therefore an asset for quantitative analysis. However, the development of automated segmentation methods is hampered by the scarce availability of fetal brain MRI annotated datasets and the limited variability within these cohorts. In this context, we propose to leverage the power of fetal brain MRI super-resolution (SR) reconstruction methods to generate multiple reconstructions of a single subject with different parameters, thus providing an efficient, tuning-free data augmentation strategy. Overall, this strategy significantly improves the generalization of segmentation methods across SR pipelines.

IV · Dec 9, 2025
Causal Attribution of Model Performance Gaps in Medical Imaging Under Distribution Shifts

Pedro M. Gordaliza, Nataliia Molchanova, Jaume Banus et al.

Deep learning models for medical image segmentation suffer significant performance drops due to distribution shifts, but the causal mechanisms behind these drops remain poorly understood. We extend causal attribution frameworks to high-dimensional segmentation tasks, quantifying how acquisition protocols and annotation variability independently contribute to performance degradation. We model the data-generating process through a causal graph and employ Shapley values to fairly attribute performance changes to individual mechanisms. Our framework addresses unique challenges in medical imaging: high-dimensional outputs, limited samples, and complex mechanism interactions. Validation on multiple sclerosis (MS) lesion segmentation across 4 centers and 7 annotators reveals context-dependent failure modes: annotation protocol shifts dominate when crossing annotators (7.4% ± 8.9% DSC attribution), while acquisition shifts dominate when crossing imaging centers (6.5% ± 9.1%). This mechanism-specific quantification enables practitioners to prioritize targeted interventions based on deployment context.
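
As a minimal sketch of how Shapley values split a performance drop between mechanisms, the example below computes exact Shapley values for two mechanisms by averaging marginal contributions over all orderings. The mechanism names mirror the abstract, but the performance drops in the `drop` table are hypothetical numbers invented for illustration, and the causal-graph modelling described in the paper is not reproduced here.

```python
from itertools import permutations

def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings.

    players: list of hashable player names.
    value: function mapping a frozenset of players to a real number.
    """
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: t / len(orderings) for p, t in totals.items()}

# HYPOTHETICAL DSC drop (percentage points) when a set of mechanisms is
# shifted from the source to the target distribution.
drop = {
    frozenset(): 0.0,
    frozenset({"acquisition"}): 5.0,
    frozenset({"annotation"}): 8.0,
    frozenset({"acquisition", "annotation"}): 14.0,
}

phi = shapley_values(["acquisition", "annotation"], lambda s: drop[frozenset(s)])
print(phi)
```

The attributions sum to the total drop when both mechanisms are shifted, which is the efficiency property that makes Shapley values suitable for dividing a performance gap. Exact enumeration is only practical for a handful of mechanisms because the number of orderings grows factorially.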

LG · Nov 4, 2025
Accounting for Underspecification in Statistical Claims of Model Superiority

Thomas Sanchez, Pedro M. Gordaliza, Meritxell Bach Cuadra

Machine learning methods are increasingly applied in medical imaging, yet many reported improvements lack statistical robustness: recent works have highlighted that small but significant performance gains are highly likely to be false positives. However, these analyses do not account for underspecification, the fact that models achieving similar validation scores may behave differently on unseen data due to random initialization or training dynamics. Here, we extend a recent statistical framework modeling false outperformance claims to include underspecification as an additional variance component. Our simulations demonstrate that even modest seed variability (~1%) substantially increases the evidence required to support superiority claims. Our findings underscore the need for explicit modeling of training variance when validating medical imaging systems.
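
A toy calculation can show why an extra variance component matters. The sketch below is not the statistical framework from the paper; it assumes two models with identical true performance whose observed scores carry independent Gaussian test-sampling noise and, optionally, Gaussian seed noise, and it computes the probability that one appears to beat the other by a margin `delta`. All parameter values are hypothetical.

```python
import math

def false_claim_probability(delta, sigma_test, sigma_seed=0.0):
    """P(observed gain > delta) for two models with equal true performance.

    ASSUMPTION: each observed score is the true score plus independent
    Gaussian test-sampling noise (sigma_test) and seed noise (sigma_seed).
    """
    # The difference of two independent scores has twice the per-score variance.
    sd_diff = math.sqrt(2.0 * (sigma_test ** 2 + sigma_seed ** 2))
    z = delta / sd_diff
    # Upper tail of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Hypothetical values: a 2-point gain, 1-point test noise, 1-point seed noise.
p_without_seed = false_claim_probability(delta=0.02, sigma_test=0.01)
p_with_seed = false_claim_probability(delta=0.02, sigma_test=0.01, sigma_seed=0.01)
print(round(p_without_seed, 4), round(p_with_seed, 4))
```

Under these assumptions, adding seed variability makes a spurious gain of the same size more likely, so a larger observed margin or more evaluation data would be needed to support the same claim.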

CV · Jan 19 · Code
From 100,000+ images to winning the first brain MRI foundation model challenges: Sharing lessons and models

Pedro M. Gordaliza, Jaume Banus, Benoît Gérin et al.

Developing Foundation Models for medical image analysis is essential to overcome the unique challenges of radiological tasks. The first challenges of this kind for 3D brain MRI, SSL3D and FOMO25, were held at MICCAI 2025. Our solution ranked first in tracks of both contests. It relies on a U-Net CNN architecture combined with strategies leveraging anatomical priors and neuroimaging domain knowledge. Notably, our models trained 1-2 orders of magnitude faster and were 10 times smaller than competing transformer-based approaches. Models are available here: https://github.com/jbanusco/BrainFM4Challenges.

IV · Mar 13, 2025 · Code
Automatic quality control in multi-centric fetal brain MRI super-resolution reconstruction

Thomas Sanchez, Vladyslav Zalevskyi, Angeline Mihailov et al.

Quality control (QC) has long been considered essential to guarantee the reliability of neuroimaging studies. It is particularly important for fetal brain MRI, where acquisitions and image processing techniques are less standardized than in adult imaging. In this work, we focus on automated quality control of super-resolution reconstruction (SRR) volumes of fetal brain MRI, an important processing step where multiple stacks of thick 2D slices are registered together and combined to build a single, isotropic and artifact-free T2-weighted volume. We propose FetMRQC$_{SR}$, a machine-learning method that extracts more than 100 image quality metrics to predict image quality scores using a random forest model. This approach is well suited to a problem that is high dimensional, with highly heterogeneous data and small datasets. We validate FetMRQC$_{SR}$ in an out-of-domain (OOD) setting and report high performance (ROC AUC = 0.89), even when faced with data from an unknown site or SRR method. We also investigate failure cases and show that they occur in 45% of the images due to ambiguous configurations for which the rating from the expert is arguable. These results are encouraging and illustrate how a non-deep-learning-based method like FetMRQC$_{SR}$ is well suited to this multifaceted problem. Our tool, along with all the code used to generate, train and evaluate the model, is available at https://github.com/Medical-Image-Analysis-Laboratory/fetmrqc_sr/ .

MED-PH · Sep 6, 2021 · Code
FaBiAN: A Fetal Brain magnetic resonance Acquisition Numerical phantom

Hélène Lajous, Christopher W. Roy, Tom Hilbert et al.

Accurate characterization of in utero human brain maturation is critical as it involves complex and interconnected structural and functional processes that may influence health later in life. Magnetic resonance imaging is a powerful tool to investigate equivocal neurological patterns during fetal development. However, the number of acquisitions of satisfactory quality available in this cohort of sensitive subjects remains scarce, thus hindering the validation of advanced image processing techniques. Numerical phantoms can mitigate these limitations by providing a controlled environment with a known ground truth. In this work, we present FaBiAN, an open-source Fetal Brain magnetic resonance Acquisition Numerical phantom that simulates clinical T2-weighted fast spin echo sequences of the fetal brain. This unique tool is based on a general, flexible and realistic setup that includes stochastic fetal movements, thus providing images of the fetal brain throughout maturation comparable to clinical acquisitions. We demonstrate its value to evaluate the robustness and optimize the accuracy of an algorithm for super-resolution fetal brain magnetic resonance imaging from simulated motion-corrupted 2D low-resolution series as compared to a synthetic high-resolution reference volume. We also show that the images generated can complement clinical datasets to support data-intensive deep learning methods for fetal brain tissue segmentation.

IV · Mar 22, 2024
Improving cross-domain brain tissue segmentation in fetal MRI with synthetic data

Vladyslav Zalevskyi, Thomas Sanchez, Margaux Roulet et al.

Segmentation of fetal brain tissue from magnetic resonance imaging (MRI) plays a crucial role in the study of in utero neurodevelopment. However, automated tools face substantial domain shift challenges as they must be robust to highly heterogeneous clinical data, often limited in numbers and lacking annotations. Indeed, high variability of the fetal brain morphology, MRI acquisition parameters, and superresolution reconstruction (SR) algorithms adversely affect the model's performance when evaluated out-of-domain. In this work, we introduce FetalSynthSeg, a domain randomization method to segment fetal brain MRI, inspired by SynthSeg. Our results show that models trained solely on synthetic data outperform models trained on real data in out-of-domain settings, validated on a 120-subject cross-domain dataset. Furthermore, we extend our evaluation to 40 subjects acquired using low-field (0.55T) MRI and reconstructed with novel SR models, showcasing robustness across different magnetic field strengths and SR algorithms. Leveraging a generative synthetic approach, we tackle the domain shift problem in fetal brain MRI and offer compelling prospects for applications in fields with limited and highly heterogeneous data.

IV · Dec 22, 2023
Cross-Age and Cross-Site Domain Shift Impacts on Deep Learning-Based White Matter Fiber Estimation in Newborn and Baby Brains

Rizhong Lin, Ali Gholipour, Jean-Philippe Thiran et al.

Deep learning models have shown great promise in estimating tissue microstructure from limited diffusion magnetic resonance imaging data. However, these models face domain shift challenges when test and train data are from different scanners and protocols, or when the models are applied to data with inherent variations such as the developing brains of infants and children scanned at various ages. Several techniques have been proposed to address some of these challenges, such as data harmonization or domain adaptation in the adult brain. However, those techniques remain unexplored for the estimation of fiber orientation distribution functions in the rapidly developing brains of infants. In this work, we extensively investigate the age effect and domain shift within and across two different cohorts of 201 newborns and 165 babies using the Method of Moments and fine-tuning strategies. Our results show that reduced variations in the microstructural development of babies in comparison to newborns directly impact the deep learning models' cross-age performance. We also demonstrate that a small number of target domain samples can significantly mitigate domain shift problems.

CVMay 5, 2025
Advances in Automated Fetal Brain MRI Segmentation and Biometry: Insights from the FeTA 2024 Challenge

Vladyslav Zalevskyi, Thomas Sanchez, Misha Kaandorp et al.

Accurate fetal brain tissue segmentation and biometric analysis are essential for studying brain development in utero. The FeTA Challenge 2024 advanced automated fetal brain MRI analysis by introducing biometry prediction as a new task alongside tissue segmentation. For the first time, our diverse multi-centric test set included data from a new low-field (0.55T) MRI dataset. Evaluation metrics were also expanded to include the topology-specific Euler characteristic difference (ED). Sixteen teams submitted segmentation methods, most of which performed consistently across both high- and low-field scans. However, longitudinal trends indicate that segmentation accuracy may be reaching a plateau, with results now approaching inter-rater variability. The ED metric uncovered topological differences that were missed by conventional metrics, while the low-field dataset achieved the highest segmentation scores, highlighting the potential of affordable imaging systems when paired with high-quality reconstruction. Seven teams participated in the biometry task, but most methods failed to outperform a simple baseline that predicted measurements based solely on gestational age, underscoring the challenge of extracting reliable biometric estimates from image data alone. Domain shift analysis identified image quality as the most significant factor affecting model generalization, with super-resolution pipelines also playing a substantial role. Other factors, such as gestational age, pathology, and acquisition site, had smaller, though still measurable, effects. Overall, FeTA 2024 offers a comprehensive benchmark for multi-class segmentation and biometry estimation in fetal brain MRI, underscoring the need for data-centric approaches, improved topological evaluation, and greater dataset diversity to enable clinically robust and generalizable AI tools.
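To make the topology-specific metric mentioned above more concrete, the sketch below computes the Euler characteristic of a binary 3D segmentation and the difference between a prediction and a reference. It is a minimal sketch under stated assumptions: voxels are treated as closed unit cubes (which corresponds to 26-connectivity for the foreground), the difference is taken as an absolute value, and the function names are hypothetical. The challenge's own implementation may differ on any of these points.

```python
from itertools import product


def euler_characteristic(voxels):
    """Euler characteristic of a set of (x, y, z) foreground voxels.

    Each voxel is treated as a closed unit cube, and
    chi = vertices - edges + faces - cubes over the resulting cubical complex.
    """
    cells = set()
    for x, y, z in voxels:
        # Doubled coordinates: a cube centre is odd along all three axes, while
        # its faces, edges and vertices have 2, 1 and 0 odd coordinates.
        cx, cy, cz = 2 * x + 1, 2 * y + 1, 2 * z + 1
        for dx, dy, dz in product((-1, 0, 1), repeat=3):
            cells.add((cx + dx, cy + dy, cz + dz))
    # The dimension of a cell is its number of odd coordinates; the sign of
    # its contribution alternates with the dimension.
    return sum((-1) ** sum(c % 2 for c in cell) for cell in cells)


def euler_difference(pred_voxels, ref_voxels):
    """Absolute difference of Euler characteristics (assumed definition)."""
    return abs(euler_characteristic(pred_voxels) - euler_characteristic(ref_voxels))


# A solid 3x3x1 slab versus the same slab with a hole punched through it.
slab = {(x, y, 0) for x in range(3) for y in range(3)}
ring = slab - {(1, 1, 0)}
ed = euler_difference(ring, slab)
```

The two shapes differ by a single voxel, so an overlap metric such as Dice would rate them as nearly identical, whereas the Euler characteristic changes because the hole alters the topology.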

IVNov 11, 2024
DRIFTS: Optimizing Domain Randomization with Synthetic Data and Weight Interpolation for Fetal Brain Tissue Segmentation

Vladyslav Zalevskyi, Thomas Sanchez, Margaux Roulet et al.

Fetal brain tissue segmentation in magnetic resonance imaging (MRI) is a crucial tool that supports understanding of neurodevelopment, yet it faces challenges due to the heterogeneity of data coming from different scanners and settings, as well as data scarcity. Recent approaches based on domain randomization, like SynthSeg, have shown great potential for single-source domain generalization by simulating images with randomized contrast and image resolution from the label maps. In this work, we investigate how to maximize the out-of-domain (OOD) generalization potential of SynthSeg-based methods in fetal brain MRI. Specifically, we demonstrate that the simple Gaussian mixture models employed in FetalSynthSeg outperform physics-informed generation methods in terms of OOD generalization. We further show that incorporating intensity clustering significantly enhances generalization in settings with limited label classes by producing more realistic synthetic data. By combining synthetic pretraining with fine-tuning on real images and applying weight-space interpolation between the two models, we propose DRIFTS as an effective and practical solution for single-source domain generalization. DRIFTS consistently outperforms current state-of-the-art models across multiple benchmarks and is, to our knowledge, the first method to achieve accurate brain tissue segmentation on fetal T1-weighted images. We validate our approach on 308 subjects from four datasets acquired at three different sites, covering a range of scanner field strengths (0.55T to 3T) and both T1w and T2w modalities. We conclude with five practical recommendations to guide the development of SynthSeg-based methods for other organs and imaging modalities.
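The weight-space interpolation step mentioned above can be pictured as a parameter-wise blend of the synthetically pretrained model and its fine-tuned counterpart. The sketch below is a minimal illustration that assumes a plain linear interpolation with a single coefficient `alpha`; the abstract does not specify the scheme, so the formula, the function name `interpolate_weights`, and the toy parameter values are assumptions.

```python
def interpolate_weights(synthetic_weights, finetuned_weights, alpha):
    """Blend two models parameter by parameter.

    Returns (1 - alpha) * synthetic + alpha * finetuned for every parameter.
    Both arguments map parameter names to flat lists of floats and must
    describe the same architecture.
    """
    if synthetic_weights.keys() != finetuned_weights.keys():
        raise ValueError("both models must share the same parameter names")
    blended = {}
    for name, synthetic in synthetic_weights.items():
        finetuned = finetuned_weights[name]
        if len(synthetic) != len(finetuned):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        blended[name] = [
            (1.0 - alpha) * s + alpha * f for s, f in zip(synthetic, finetuned)
        ]
    return blended


# Hypothetical two-parameter "models" standing in for real network weights.
pretrained = {"conv.weight": [0.0, 2.0], "conv.bias": [1.0]}
finetuned = {"conv.weight": [1.0, 4.0], "conv.bias": [3.0]}

blended = interpolate_weights(pretrained, finetuned, alpha=0.25)
```

With `alpha=0` the blend reduces to the synthetic-only model and with `alpha=1` to the fine-tuned one, so intermediate values trade generalization learned from randomized data against fit to the real training domain.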

IVMay 14, 2025
Meta-learning Slice-to-Volume Reconstruction in Fetal Brain MRI using Implicit Neural Representations

Maik Dannecker, Thomas Sanchez, Meritxell Bach Cuadra et al.

High-resolution slice-to-volume reconstruction (SVR) from multiple motion-corrupted low-resolution 2D slices constitutes a critical step in image-based diagnostics of moving subjects, such as fetal brain Magnetic Resonance Imaging (MRI). Existing solutions struggle with image artifacts and severe subject motion or require slice pre-alignment to achieve satisfying reconstruction performance. We propose a novel SVR method to enable fast and accurate MRI reconstruction even in cases of severe image and motion corruption. Our approach performs motion correction, outlier handling, and super-resolution reconstruction with all operations being entirely based on implicit neural representations. The model can be initialized with task-specific priors through fully self-supervised meta-learning on either simulated or real-world data. In extensive experiments including over 480 reconstructions of simulated and clinical MRI brain data from different centers, we prove the utility of our method in cases of severe subject motion and image artifacts. Our results demonstrate improvements in reconstruction quality, especially in the presence of severe motion, compared to state-of-the-art methods, and up to 50% reduction in reconstruction time.

IVApr 14, 2025
Towards contrast- and pathology-agnostic clinical fetal brain MRI segmentation using SynthSeg

Ziyao Shang, Misha Kaandorp, Kelly Payette et al.

Magnetic resonance imaging (MRI) has played a crucial role in fetal neurodevelopmental research. Structural annotations of MR images are an important step for quantitative analysis of the developing human brain, with deep learning providing an automated alternative for this otherwise tedious manual process. However, the segmentation performance of convolutional neural networks often suffers from domain shift, where the network fails when applied to subjects that deviate from the distribution on which it was trained. In this work, we aim to train networks capable of automatically segmenting fetal brain MRIs with a wide range of domain shifts pertaining to differences in subject physiology and acquisition environments, in particular shape-based differences commonly observed in pathological cases. We introduce a novel data-driven train-time sampling strategy that seeks to fully exploit the diversity of a given training dataset to enhance the domain generalizability of the trained networks. We adapted our sampler, together with other existing data augmentation techniques, to the SynthSeg framework, a generator that utilizes domain randomization to generate diverse training data, and ran thorough experiments and ablation studies on a wide range of training/testing data to test the validity of the approaches. Our networks achieved notable improvements in the segmentation quality on testing subjects with intense anatomical abnormalities (p < 1e-4), though at the cost of a slight decrease in performance in cases with fewer abnormalities. Our work also lays the foundation for future work on creating and adapting data-driven sampling strategies for other training pipelines.

CVAug 28, 2025
Enhancing Corpus Callosum Segmentation in Fetal MRI via Pathology-Informed Domain Randomization

Marina Grifell i Plana, Vladyslav Zalevskyi, Léa Schmidt et al.

Accurate fetal brain segmentation is crucial for extracting biomarkers and assessing neurodevelopment, especially in conditions such as corpus callosum dysgenesis (CCD), which can induce drastic anatomical changes. However, the rarity of CCD severely limits annotated data, hindering the generalization of deep learning models. To address this, we propose a pathology-informed domain randomization strategy that embeds prior knowledge of CCD manifestations into a synthetic data generation pipeline. By simulating diverse brain alterations from healthy data alone, our approach enables robust segmentation without requiring pathological annotations. We validate our method on a cohort comprising 248 healthy fetuses, 26 with CCD, and 47 with other brain pathologies, achieving substantial improvements on CCD cases while maintaining performance on both healthy fetuses and those with other pathologies. From the predicted segmentations, we derive clinically relevant biomarkers, such as corpus callosum length (LCC) and volume, and show their utility in distinguishing CCD subtypes. Our pathology-informed augmentation reduces the LCC estimation error from 1.89 mm to 0.80 mm in healthy cases and from 10.9 mm to 0.7 mm in CCD cases. Beyond these quantitative gains, our approach yields segmentations with improved topological consistency relative to available ground truth, enabling more reliable shape-based analyses. Overall, this work demonstrates that incorporating domain-specific anatomical priors into synthetic data pipelines can effectively mitigate data scarcity and enhance analysis of rare but clinically significant malformations.

CVAug 14, 2025
Physics-Informed Joint Multi-TE Super-Resolution with Implicit Neural Representation for Robust Fetal T2 Mapping

Busra Bulut, Maik Dannecker, Thomas Sanchez et al.

T2 mapping in fetal brain MRI has the potential to improve characterization of the developing brain, especially at mid-field (0.55T), where T2 decay is slower. However, this is challenging as fetal MRI acquisition relies on multiple motion-corrupted stacks of thick slices, requiring slice-to-volume reconstruction (SVR) to estimate a high-resolution (HR) 3D volume. Currently, T2 mapping involves repeated acquisitions of these stacks at each echo time (TE), leading to long scan times and high sensitivity to motion. We tackle this challenge with a method that jointly reconstructs data across TEs, addressing severe motion. Our approach combines implicit neural representations with a physics-informed regularization that models T2 decay, enabling information sharing across TEs while preserving anatomical and quantitative T2 fidelity. We demonstrate state-of-the-art performance on simulated fetal brain and in vivo adult datasets with fetal-like motion. We also present the first in vivo fetal T2 mapping results at 0.55T. Our study shows potential for reducing the number of stacks per TE in T2 mapping by leveraging anatomical redundancy.
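For readers unfamiliar with T2 mapping, the sketch below shows the simplest form of the decay model that such a physics-informed term can build on: a mono-exponential signal S(TE) = S0 * exp(-TE / T2), fitted per voxel by least squares on the log signal. The mono-exponential form, the voxel-wise fit, the function name `fit_t2`, and the numeric values are assumptions for illustration only; the paper embeds its decay model in an implicit neural representation rather than fitting voxels independently.

```python
import math


def fit_t2(echo_times_ms, signals):
    """Log-linear least-squares fit of S(TE) = S0 * exp(-TE / T2).

    Taking logs gives log S = log S0 - TE / T2, a straight line in TE whose
    slope is -1 / T2. Returns the tuple (S0, T2), with T2 in milliseconds.
    """
    log_signals = [math.log(s) for s in signals]
    n = len(echo_times_ms)
    mean_te = sum(echo_times_ms) / n
    mean_log = sum(log_signals) / n
    covariance = sum(
        (te - mean_te) * (ls - mean_log)
        for te, ls in zip(echo_times_ms, log_signals)
    )
    variance = sum((te - mean_te) ** 2 for te in echo_times_ms)
    slope = covariance / variance
    intercept = mean_log - slope * mean_te
    return math.exp(intercept), -1.0 / slope


# Noise-free synthetic voxel with made-up parameters (S0 = 1000, T2 = 250 ms).
echo_times_ms = [100.0, 200.0, 300.0, 400.0]
signals = [1000.0 * math.exp(-te / 250.0) for te in echo_times_ms]

s0_hat, t2_hat = fit_t2(echo_times_ms, signals)
```

Because every echo time constrains the same two parameters, measurements at different TEs carry shared information, which is the intuition behind reconstructing them jointly instead of one TE at a time.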

IVJul 16, 2025
Benchmarking and Explaining Deep Learning Cortical Lesion MRI Segmentation in Multiple Sclerosis

Nataliia Molchanova, Alessandro Cagol, Mario Ocampo-Pineda et al.

Cortical lesions (CLs) have emerged as valuable biomarkers in multiple sclerosis (MS), offering high diagnostic specificity and prognostic relevance. However, their routine clinical integration remains limited due to subtle magnetic resonance imaging (MRI) appearance, challenges in expert annotation, and a lack of standardized automated methods. We propose a comprehensive multi-centric benchmark of CL detection and segmentation in MRI. A total of 656 MRI scans, including clinical trial and research data from four institutions, were acquired at 3T and 7T using MP2RAGE and MPRAGE sequences with expert-consensus annotations. We rely on the self-configuring nnU-Net framework, designed for medical imaging segmentation, and propose adaptations tailored to improve CL detection. We evaluated model generalization through out-of-distribution testing, demonstrating strong lesion detection capabilities with an F1-score of 0.64 and 0.5 in and out of the domain, respectively. We also analyze internal model features and model errors for a better understanding of AI decision-making. Our study examines how data variability, lesion ambiguity, and protocol differences impact model performance, offering future recommendations to address these barriers to clinical adoption. To reinforce reproducibility, the implementation and models will be publicly accessible and ready to use at https://github.com/Medical-Image-Analysis-Laboratory/ and https://doi.org/10.5281/zenodo.15911797.

IVApr 7, 2025
Explaining Uncertainty in Multiple Sclerosis Lesion Segmentation Beyond Prediction Errors

Nataliia Molchanova, Pedro M. Gordaliza, Alessandro Cagol et al.

Trustworthy artificial intelligence (AI) is essential in healthcare, particularly for high-stakes tasks like medical image segmentation. Explainable AI and uncertainty quantification significantly enhance AI reliability by addressing key attributes such as robustness, usability, and explainability. Despite extensive technical advances in uncertainty quantification for medical imaging, understanding the clinical informativeness and interpretability of uncertainty remains limited. This study introduces a novel framework to explain the potential sources of predictive uncertainty, specifically in cortical lesion segmentation in multiple sclerosis using deep ensembles. The proposed analysis shifts the focus from the uncertainty-error relationship towards relevant medical and engineering factors. Our findings reveal that instance-wise uncertainty is strongly related to lesion size, shape, and cortical involvement. Expert rater feedback confirms that similar factors impede annotator confidence. Evaluations conducted on two datasets (206 patients, almost 2000 lesions) under both in-domain and distribution-shift conditions highlight the utility of the framework in different scenarios.
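As background for the deep-ensemble uncertainty discussed above, the sketch below computes one common voxel-level measure: the entropy of the ensemble-averaged foreground probability. It is a minimal sketch; the choice of entropy, the function name `ensemble_entropy`, and the example probabilities are assumptions, and the study itself works with instance-wise (per-lesion) uncertainty, whose exact definition is not given in the abstract.

```python
import math


def ensemble_entropy(member_probs):
    """Binary entropy, in bits, of the mean foreground probability.

    `member_probs` holds one predicted lesion probability per ensemble member
    for a single voxel. The result is 0 when the averaged prediction is
    certain and 1 when it is maximally ambiguous (mean probability 0.5).
    """
    p = sum(member_probs) / len(member_probs)
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


# Hypothetical predictions from a three-member ensemble.
disagreeing = ensemble_entropy([0.9, 0.1, 0.5])  # members contradict each other
agreeing = ensemble_entropy([1.0, 1.0, 1.0])     # members all certain
```

Averaging such voxel values over the voxels of a predicted lesion is one simple way to obtain a per-lesion score, which could then be compared against factors such as lesion size or shape.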

IVMar 22, 2025
Assessing workflow impact and clinical utility of AI-assisted brain aneurysm detection: a multi-reader study

Tommaso Di Noto, Sofyan Jankowski, Francesco Puccinelli et al.

Despite the plethora of AI-based algorithms developed for anomaly detection in radiology, subsequent integration into clinical settings is rarely evaluated. In this work, we assess the applicability and utility of an AI-based model for brain aneurysm detection comparing the performance of two readers with different levels of experience (2 and 13 years). We aim to answer the following questions: 1) Do the readers improve their performance when assisted by the AI algorithm? 2) How much does the AI algorithm impact routine clinical workflow? We reuse and enlarge our open-access, Time-Of-Flight Magnetic Resonance Angiography dataset (N=460). We use 360 subjects for training/validating our algorithm and 100 as an unseen test set for the reading session. Even though our model reaches state-of-the-art results on the test set (sensitivity=74%, false positive rate=1.6), we show that neither the junior nor the senior reader significantly increases their sensitivity (p=0.59, p=1, respectively). In addition, we find that reading time for both readers is significantly higher in the "AI-assisted" setting than in the "Unassisted" (+15 seconds, on average; p=3x10^(-4) junior, p=3x10^(-5) senior). The confidence reported by the readers is unchanged across the two settings, indicating that the AI assistance does not influence the certainty of the diagnosis. Our findings highlight the importance of clinical validation of AI algorithms in a clinical setting involving radiologists. This study should serve as a reminder to the community to always examine the real-world effectiveness and workflow impact of proposed algorithms.

IVJun 13, 2024
Instance-level quantitative saliency in multiple sclerosis lesion segmentation

Federico Spagnolo, Nataliia Molchanova, Meritxell Bach Cuadra et al.

Explainable artificial intelligence (XAI) methods have been proposed to interpret model decisions in classification and, more recently, in semantic segmentation. However, instance-level XAI for semantic segmentation, namely explanations focused on a single object among multiple instances of the same class, remains largely unexplored. Such explanations are particularly important in multi-lesional diseases to understand what drives the detection and contouring of a specific lesion. We propose instance-level explanation maps for semantic segmentation by extending SmoothGrad and Grad-CAM++ to obtain quantitative instance saliency. These methods were applied to the segmentation of white matter lesions (WMLs), a magnetic resonance imaging biomarker in multiple sclerosis. We used 4023 FLAIR and MPRAGE MRI scans from 687 patients collected at the University Hospital of Basel, Switzerland, with WML masks annotated by four expert clinicians. Three deep learning architectures, a 3D U-Net, nnU-Net, and Swin UNETR, were trained and evaluated, achieving normalized Dice scores of 0.71, 0.78, and 0.80, respectively. Instance saliency maps showed that the models relied primarily on FLAIR rather than MPRAGE for WML segmentation, with positive saliency inside lesions and negative saliency in their immediate neighborhood, consistent with clinical practice. Peak saliency values differed significantly across correct and incorrect predictions, suggesting that quantitative instance saliency may help identify segmentation errors. In conclusion, we introduce two architecture-agnostic XAI methods that provide quantitative instance-level explanations for semantic segmentation and support clinically meaningful interpretation of model decisions.

IVJan 29, 2022
Validation and Generalizability of Self-Supervised Image Reconstruction Methods for Undersampled MRI

Thomas Yu, Tom Hilbert, Gian Franco Piredda et al.

Deep learning methods have become the state of the art for undersampled MR reconstruction. Particularly for cases where it is infeasible or impossible to acquire fully sampled ground-truth data, self-supervised machine learning methods for reconstruction are increasingly used. However, potential issues in the validation of such methods, as well as their generalizability, remain underexplored. In this paper, we investigate important aspects of the validation of self-supervised algorithms for reconstruction of undersampled MR images: quantitative evaluation of prospective reconstructions, potential differences between prospective and retrospective reconstructions, suitability of commonly used quantitative metrics, and generalizability. Two self-supervised algorithms based on self-supervised denoising and the deep image prior were investigated. These methods are compared to a least squares fitting and a compressed sensing reconstruction using in-vivo and phantom data. Their generalizability was tested with prospectively under-sampled data from experimental conditions different to the training. We show that prospective reconstructions can exhibit significant distortion relative to retrospective reconstructions/ground truth. Furthermore, pixel-wise quantitative metrics may not capture differences in perceptual quality accurately, in contrast to a perceptual metric. In addition, all methods showed potential for generalization; however, generalizability is more affected by changes in anatomy/contrast than other changes. We further showed that no-reference image metrics correspond well with human rating of image quality for studying generalizability. Finally, we showed that a well-tuned compressed sensing reconstruction and learned denoising perform similarly on all data.

IVJan 19, 2022
Cortical lesions, central vein sign, and paramagnetic rim lesions in multiple sclerosis: emerging machine learning techniques and future avenues

Francesco La Rosa, Maxence Wynen, Omar Al-Louzi et al.

The current multiple sclerosis (MS) diagnostic criteria lack specificity, and this may lead to misdiagnosis, which remains an issue in present-day clinical practice. In addition, conventional biomarkers only moderately correlate with MS disease progression. Recently, advanced MS lesional imaging biomarkers such as cortical lesions (CL), the central vein sign (CVS), and paramagnetic rim lesions (PRL), visible in specialized magnetic resonance imaging (MRI) sequences, have shown higher specificity in differential diagnosis. Moreover, studies have shown that CL and PRL are potential prognostic biomarkers, the former correlating with cognitive impairments and the latter with early disability progression. As machine learning-based methods have achieved extraordinary performance in the assessment of conventional imaging biomarkers, such as white matter lesion segmentation, several automated or semi-automated methods have been proposed for CL, CVS, and PRL as well. In the present review, we first introduce these advanced MS imaging biomarkers and their imaging methods. Subsequently, we describe the corresponding machine learning-based methods that were used to tackle these clinical questions, putting them into context with respect to the challenges they are still facing, including non-standardized MRI protocols, limited datasets, and moderate inter-rater variability. We conclude by presenting the current limitations that prevent their broader deployment and suggesting future research directions.

IVNov 22, 2021
4D iterative reconstruction of brain fMRI in the moving fetus

Athena Taymourtash, Hamza Kebiri, Sébastien Tourbier et al.

Resting-state functional Magnetic Resonance Imaging (fMRI) is a powerful imaging technique for studying functional development of the brain in utero. However, unpredictable and excessive movement of fetuses has limited clinical application since it causes substantial signal fluctuations which can systematically alter observed patterns of functional connectivity. Previous studies have focused on the accurate estimation of the motion parameters in case of large fetal head movement and used a 3D single step interpolation approach at each timepoint to recover motion-free fMRI images. This does not guarantee that the reconstructed image corresponds to the minimum error representation of fMRI time series given the acquired data. Here, we propose a novel technique based on four dimensional iterative reconstruction of the scattered slices acquired during fetal fMRI. The accuracy of the proposed method was quantitatively evaluated on a group of real clinical fMRI fetuses. The results indicate improvements of reconstruction quality compared to the conventional 3D interpolation approach.

IVNov 8, 2021
Synthetic magnetic resonance images for domain adaptation: Application to fetal brain tissue segmentation

Priscille de Dumast, Hamza Kebiri, Kelly Payette et al.

The quantitative assessment of the developing human brain in utero is crucial to fully understand neurodevelopment. Thus, automated multi-tissue fetal brain segmentation algorithms are being developed, which in turn require annotated data to be trained. However, the available annotated fetal brain datasets are limited in number and heterogeneity, hampering domain adaptation strategies for robust segmentation. In this context, we use FaBiAN, a Fetal Brain magnetic resonance Acquisition Numerical phantom, to simulate various realistic magnetic resonance images of the fetal brain along with its class labels. We demonstrate that these multiple synthetic annotated data, generated at no cost and further reconstructed using the target super-resolution technique, can be successfully used for domain adaptation of a deep learning method that segments seven brain tissues. Overall, the accuracy of the segmentation is significantly enhanced, especially in the cortical gray matter, the white matter, the cerebellum, the deep gray matter and the brain stem.

IVMar 10, 2021
Towards automated brain aneurysm detection in TOF-MRA: open data, weak labels, and anatomical knowledge

Tommaso Di Noto, Guillaume Marie, Sebastien Tourbier et al.

Brain aneurysm detection in Time-Of-Flight Magnetic Resonance Angiography (TOF-MRA) has undergone drastic improvements with the advent of Deep Learning (DL). However, the performance of supervised DL models relies heavily on the quantity of labeled samples, which are extremely costly to obtain. Here, we present a DL model for aneurysm detection that overcomes the issue with "weak" labels: oversized annotations which are considerably faster to create. Our weak labels were four times faster to generate than their voxel-wise counterparts. In addition, our model leverages prior anatomical knowledge by focusing only on plausible locations for aneurysm occurrence. We first train and evaluate our model through cross-validation on an in-house TOF-MRA dataset comprising 284 subjects (170 females / 127 healthy controls / 157 patients with 198 aneurysms). On this dataset, our best model achieved a sensitivity of 83%, with False Positive (FP) rate of 0.8 per patient. To assess model generalizability, we then participated in a challenge for aneurysm detection with TOF-MRA data (93 patients, 20 controls, 125 aneurysms). On the public challenge, sensitivity was 68% (FP rate=2.5), ranking 4th/18 on the open leaderboard. We found no significant difference in sensitivity between aneurysm risk-of-rupture groups (p=0.75), locations (p=0.72), or sizes (p=0.15). Data, code and model weights are released under permissive licenses. We demonstrate that weak labels and anatomical knowledge can alleviate the necessity for prohibitively expensive voxel-wise annotations.

IVNov 27, 2020
An anatomically-informed 3D CNN for brain aneurysm classification with weak labels

Tommaso Di Noto, Guillaume Marie, Sébastien Tourbier et al.

A commonly adopted approach to carry out detection tasks in medical imaging is to rely on an initial segmentation. However, this approach strongly depends on voxel-wise annotations which are repetitive and time-consuming to draw for medical experts. An interesting alternative to voxel-wise masks is so-called "weak" labels: these can either be coarse or oversized annotations that are less precise, but noticeably faster to create. In this work, we address the task of brain aneurysm detection as a patch-wise binary classification with weak labels, in contrast to related studies that rather use supervised segmentation methods and voxel-wise delineations. Our approach comes with the non-trivial challenge of the data set creation: as for most focal diseases, anomalous patches (with aneurysm) are outnumbered by those showing no anomaly, and the two classes usually have different spatial distributions. To tackle this frequent scenario of inherently imbalanced, spatially skewed data sets, we propose a novel, anatomically-driven approach by using a multi-scale and multi-input 3D Convolutional Neural Network (CNN). We apply our model to 214 subjects (83 patients, 131 controls) who underwent Time-Of-Flight Magnetic Resonance Angiography (TOF-MRA) and presented a total of 111 unruptured cerebral aneurysms. We compare two strategies for negative patch sampling that have an increasing level of difficulty for the network and we show how this choice can strongly affect the results. To assess whether the added spatial information helps improve performance, we compare our anatomically-informed CNN with a baseline, spatially-agnostic CNN. When considering the more realistic and challenging scenario including vessel-like negative patches, the former model attains the highest classification results (accuracy ≈ 95%, AUROC ≈ 0.95, AUPR ≈ 0.71), thus outperforming the baseline.

IVOct 29, 2020
An automatic multi-tissue human fetal brain segmentation benchmark using the Fetal Tissue Annotation Dataset

Kelly Payette, Priscille de Dumast, Hamza Kebiri et al.

It is critical to quantitatively analyse the developing human fetal brain in order to fully understand neurodevelopment in both normal fetuses and those with congenital disorders. To facilitate this analysis, automatic multi-tissue fetal brain segmentation algorithms are needed, which in turn require open databases of segmented fetal brains. Here we introduce a publicly available database of 50 manually segmented pathological and non-pathological fetal magnetic resonance brain volume reconstructions across a range of gestational ages (20 to 33 weeks) into 7 different tissue categories (external cerebrospinal fluid, grey matter, white matter, ventricles, cerebellum, deep grey matter, brainstem/spinal cord). In addition, we quantitatively evaluate the accuracy of several automatic multi-tissue segmentation algorithms of the developing human fetal brain. Four research groups participated, submitting a total of 10 algorithms, demonstrating the benefits of the database for the development of automatic algorithms.

IVOct 23, 2020
Segmentation of the cortical plate in fetal brain MRI with a topological loss

Priscille de Dumast, Hamza Kebiri, Chirine Atat et al.

The fetal cortical plate undergoes drastic morphological changes throughout early in utero development that can be observed using magnetic resonance (MR) imaging. An accurate MR image segmentation, and more importantly a topologically correct delineation of the cortical gray matter, is a key baseline to perform further quantitative analysis of brain development. In this paper, we propose for the first time the integration of a topological constraint, as an additional loss function, to enhance the morphological consistency of a deep learning-based segmentation of the fetal cortical plate. We quantitatively evaluate our method on 18 fetal brain atlases ranging from 21 to 38 weeks of gestation, showing the significant benefits of our method through all gestational ages as compared to a baseline method. Furthermore, qualitative evaluation by three different experts on 130 randomly selected slices from 26 clinical MRIs evidences the out-performance of our method independently of the MR reconstruction quality.

IVAug 15, 2020
Automated Detection of Cortical Lesions in Multiple Sclerosis Patients with 7T MRI

Francesco La Rosa, Erin S Beck, Ahmed Abdulkadir et al.

The automated detection of cortical lesions (CLs) in patients with multiple sclerosis (MS) is a challenging task that, despite its clinical relevance, has received very little attention. Accurate detection of the small and scarce lesions requires specialized sequences and high or ultra-high field MRI. For supervised training based on multimodal structural MRI at 7T, two experts generated ground truth segmentation masks of 60 patients with 2014 CLs. We implemented a simplified 3D U-Net with three resolution levels (3D U-Net-). By increasing the complexity of the task (adding brain tissue segmentation), while randomly dropping input channels during training, we improved the performance compared to the baseline. Considering a minimum lesion size of 0.75 μL, we achieved a lesion-wise cortical lesion detection rate of 67% and a false positive rate of 42%. However, 393 (24%) of the lesions reported as false positives were post-hoc confirmed as potential or definite lesions by an expert. This indicates the potential of the proposed method to support experts in the tedious process of CL manual segmentation.

LGSep 10, 2018
Shallow vs deep learning architectures for white matter lesion segmentation in the early stages of multiple sclerosis

Francesco La Rosa, Mário João Fartaria, Tobias Kober et al.

In this work, we present a comparison of a shallow and a deep learning architecture for the automated segmentation of white matter lesions in MR images of multiple sclerosis patients. In particular, we train and test both methods on early stage disease patients, to verify their performance in challenging conditions, more similar to a clinical setting than what is typically provided in multiple sclerosis segmentation challenges. Furthermore, we evaluate a prototype naive combination of the two methods, which refines the final segmentation. All methods were trained on 32 patients, and the evaluation was performed on a pure test set of 73 cases. Results show low lesion-wise false positives (30%) for the deep learning architecture, whereas the shallow architecture yields the best Dice coefficient (63%) and volume difference (19%). Combining both shallow and deep architectures further improves the lesion-wise metrics (69% and 26% lesion-wise true and false positive rate, respectively).