CV | Apr 26, 2023 | Code
Effect of latent space distribution on the segmentation of images with multiple annotations
Ishaan Bhat, Josien P. W. Pluim, Max A. Viergever et al.
We propose the Generalized Probabilistic U-Net, which extends the Probabilistic U-Net by allowing more general forms of the Gaussian distribution as the latent space distribution that can better approximate the uncertainty in the reference segmentations. We study the effect the choice of latent space distribution has on capturing the variation in the reference segmentations for lung tumors and white matter hyperintensities in the brain. We show that the choice of distribution affects the sample diversity of the predictions and their overlap with respect to the reference segmentations. We have made our implementation available at https://github.com/ishaanb92/GeneralizedProbabilisticUNet
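To make the idea of a non-standard latent distribution concrete, the following is a minimal sketch of drawing one latent code from a mixture of Gaussians. It is not the authors' implementation (that lives in the linked repository); the function name, the axis-aligned components, and the example parameters are assumptions made only for illustration.

```python
import random


def sample_mixture_latent(weights, means, stds, rng):
    """Draw one latent vector from a mixture of axis-aligned Gaussians.

    weights: mixing weights, one per component
    means, stds: per-component lists of per-dimension means / std devs
    rng: a random.Random instance (passed in for reproducibility)
    """
    # Step 1: pick a mixture component according to the mixing weights.
    k = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    # Step 2: sample each latent dimension from the chosen component.
    return [rng.gauss(m, s) for m, s in zip(means[k], stds[k])]


# Hypothetical two-component, two-dimensional latent space.
rng = random.Random(0)
z = sample_mixture_latent(
    weights=[0.5, 0.5],
    means=[[0.0, 0.0], [5.0, 5.0]],
    stds=[[1.0, 1.0], [0.5, 0.5]],
    rng=rng,
)
```

Each sampled `z` would be combined with the U-Net features to produce one plausible segmentation; a multi-modal latent distribution lets different samples land in distinct modes.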
IV | Jun 22, 2022 | Code
Influence of uncertainty estimation techniques on false-positive reduction in liver lesion detection
Ishaan Bhat, Josien P. W. Pluim, Max A. Viergever et al.
Deep learning techniques show success in detecting objects in medical images, but still suffer from false-positive predictions that may hinder accurate diagnosis. The estimated uncertainty of the neural network output has been used to flag incorrect predictions. We study the role played by features computed from neural network uncertainty estimates and shape-based features computed from binary predictions in reducing false positives in liver lesion detection by developing a classification-based post-processing step for different uncertainty estimation methods. We demonstrate an improvement in the lesion detection performance of the neural network (with respect to F1-score) for all uncertainty estimation methods on two datasets, comprising abdominal MR and CT images, respectively. We show that features computed from neural network uncertainty estimates tend not to contribute much toward reducing false positives. Our results show that factors like class imbalance (true over false positive ratio) and shape-based features extracted from uncertainty maps play an important role in distinguishing false positive from true positive predictions. Our code can be found at https://github.com/ishaanb92/FPCPipeline.
CV | Jul 26, 2022 | Code
Generalized Probabilistic U-Net for medical image segmentation
Ishaan Bhat, Josien P. W. Pluim, Hugo J. Kuijf
We propose the Generalized Probabilistic U-Net, which extends the Probabilistic U-Net by allowing more general forms of the Gaussian distribution as the latent space distribution that can better approximate the uncertainty in the reference segmentations. We study the effect the choice of latent space distribution has on capturing the uncertainty in the reference segmentations using the LIDC-IDRI dataset. We show that the choice of distribution affects the sample diversity of the predictions and their overlap with respect to the reference segmentations. For the LIDC-IDRI dataset, we show that using a mixture of Gaussians results in a statistically significant improvement in the generalized energy distance (GED) metric with respect to the standard Probabilistic U-Net. We have made our implementation available at https://github.com/ishaanb92/GeneralizedProbabilisticUNet
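The generalized energy distance mentioned above compares a set of sampled segmentations with a set of reference annotations. The sketch below uses the common squared form, 2·E[d(S,Y)] − E[d(S,S')] − E[d(Y,Y')], with d = 1 − IoU. The choice of distance and the flat 0/1 mask representation are assumptions; the abstract does not spell out either.

```python
from itertools import product


def iou_distance(a, b):
    """1 - IoU between two binary masks given as flat sequences of 0/1."""
    inter = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    # Two empty masks are treated as identical (distance 0).
    return 0.0 if union == 0 else 1.0 - inter / union


def mean_pairwise(xs, ys):
    """Mean distance over all pairs (x, y) with x in xs and y in ys."""
    total = sum(iou_distance(x, y) for x, y in product(xs, ys))
    return total / (len(xs) * len(ys))


def ged_squared(samples, references):
    """Squared generalized energy distance between two sets of masks."""
    return (
        2.0 * mean_pairwise(samples, references)
        - mean_pairwise(samples, samples)
        - mean_pairwise(references, references)
    )


# Toy 4-pixel masks (hypothetical data).
references = [[1, 1, 0, 0], [1, 0, 0, 0]]
samples = [[1, 1, 0, 0], [1, 0, 0, 0]]
score = ged_squared(samples, references)
```

Lower values are better: the score is zero when the sampled set matches the reference set, and it grows when samples either miss the references or lack their diversity.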
CV | Aug 15, 2022
Where is VALDO? VAscular Lesions Detection and segmentatiOn challenge at MICCAI 2021
Carole H. Sudre, Kimberlin Van Wijnen, Florian Dubost et al.
Imaging markers of cerebral small vessel disease provide valuable information on brain health, but their manual assessment is time-consuming and hampered by substantial intra- and interrater variability. Automated rating may benefit biomedical research, as well as clinical assessment, but the diagnostic reliability of existing algorithms is unknown. Here, we present the results of the VAscular Lesions DetectiOn and Segmentation ("Where is VALDO?") challenge that was run as a satellite event at the international conference on Medical Image Computing and Computer Aided Intervention (MICCAI) 2021. This challenge aimed to promote the development of methods for automated detection and segmentation of small and sparse imaging markers of cerebral small vessel disease, namely enlarged perivascular spaces (EPVS) (Task 1), cerebral microbleeds (Task 2) and lacunes of presumed vascular origin (Task 3) while leveraging weak and noisy labels. Overall, 12 teams participated in the challenge, proposing solutions for one or more tasks (4 for Task 1 - EPVS, 9 for Task 2 - Microbleeds and 6 for Task 3 - Lacunes). Multi-cohort data was used in both training and evaluation. Results showed a large variability in performance both across teams and across tasks, with promising results notably for Task 1 - EPVS and Task 2 - Microbleeds and not practically useful results yet for Task 3 - Lacunes. The results also highlighted the performance inconsistency across cases, which may deter use at an individual level, while still proving useful at a population level.
IV | Jul 27, 2022
Future Unruptured Intracranial Aneurysm Growth Prediction using Mesh Convolutional Neural Networks
Kimberley M. Timmins, Maarten J. Kamphuis, Iris N. Vos et al.
The growth of unruptured intracranial aneurysms (UIAs) is a predictor of rupture. Therefore, for further imaging surveillance and treatment planning, it is important to be able to predict whether a UIA is likely to grow based on an initial baseline Time-of-Flight MRA (TOF-MRA). It is known that the size and shape of UIAs are predictors of aneurysm growth and/or rupture. We perform a feasibility study of using a mesh convolutional neural network for future UIA growth prediction from baseline TOF-MRAs. We include 151 TOF-MRAs with 169 UIAs, of which 49 were classified as growing and 120 as stable, based on the clinical definition of growth (>1 mm increase in size in the follow-up scan). UIAs were segmented from TOF-MRAs and meshes were automatically generated. We investigate two kinds of input: UIA-only meshes and region-of-interest (ROI) meshes that include the UIA and surrounding parent vessels. We develop a classification model to predict whether a UIA will grow or remain stable. The model consists of a mesh convolutional neural network with additional novel input edge features, shape index and curvedness, which describe the surface topology. We also investigate whether input edge mid-point co-ordinates influence the model performance. The model with the highest AUC (63.8%) for growth prediction used UIA meshes with input edge mid-point co-ordinate features (average F1 score = 62.3%, accuracy = 66.9%, sensitivity = 57.3%, specificity = 70.8%). We present a future UIA growth prediction model based on a mesh convolutional neural network with promising results.
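For readers less familiar with the reported figures, the sketch below shows how sensitivity, specificity, accuracy, and F1 follow from confusion-matrix counts, with "growing" as the positive class. The counts are hypothetical and are not taken from the paper; the paper reports an average F1 score, and how that average was formed is not stated in the abstract.

```python
def classification_metrics(tp, fp, tn, fn):
    """Binary classification metrics from confusion-matrix counts.

    Assumes at least one positive and one negative case, so that no
    denominator is zero.
    """
    return {
        "sensitivity": tp / (tp + fn),  # fraction of growing UIAs detected
        "specificity": tn / (tn + fp),  # fraction of stable UIAs recognised
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


# Hypothetical counts, NOT the study's results.
metrics = classification_metrics(tp=8, fp=5, tn=15, fn=2)
```

With roughly 2.4 stable UIAs for every growing one, accuracy alone can look reasonable even when sensitivity is modest, which is why the abstract reports all four figures alongside the AUC.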
IV | Aug 5, 2021 | Code
MixLacune: Segmentation of lacunes of presumed vascular origin
Denis Kutnar, Bas H. M. van der Velden, Marta Girones Sanguesa et al.
Lacunes of presumed vascular origin are fluid-filled cavities between 3 and 15 mm in diameter, visible on T1 and FLAIR brain MRI. Quantification of lacunes relies on manual annotation or semi-automatic / interactive approaches, and almost no automatic methods exist for this task. In this work, we present a two-stage approach to segment lacunes of presumed vascular origin: (1) detection with Mask R-CNN followed by (2) segmentation with a U-Net CNN. Data originates from Task 3 of the "Where is VALDO?" challenge and consists of 40 training subjects. We report a mean Dice of 0.83 on the training set and 0.84 on the validation set. Source code is available at: https://github.com/hjkuijf/MixLacune . The docker container hjkuijf/mixlacune can be pulled from https://hub.docker.com/r/hjkuijf/mixlacune .
IV | Aug 3, 2021 | Code
MixMicrobleedNet: segmentation of cerebral microbleeds using nnU-Net
Hugo J. Kuijf
Cerebral microbleeds are small hypointense lesions visible on magnetic resonance imaging (MRI) with gradient echo, T2*, or susceptibility weighted (SWI) imaging. Assessment of cerebral microbleeds is mostly performed by visual inspection. The past decade has seen the rise of semi-automatic tools to assist with rating and more recently fully automatic tools for microbleed detection. In this work, we explore the use of nnU-Net as a fully automated tool for microbleed segmentation. Data was provided by the "Where is VALDO?" challenge of MICCAI 2021. The final method consists of nnU-Net in the "3D full resolution U-Net" configuration trained on all data (fold = 'all'). No post-processing options of nnU-Net were used. Self-evaluation on the training data showed an estimated Dice of 0.80, false discovery rate of 0.16, and false negative rate of 0.15. Final evaluation on the test set of the VALDO challenge is pending. Visual inspection of the results showed that most of the reported false positives could be an actual microbleed that might have been missed during visual rating. Source code is available at: https://github.com/hjkuijf/MixMicrobleedNet . The docker container hjkuijf/mixmicrobleednet can be pulled from https://hub.docker.com/r/hjkuijf/mixmicrobleednet .
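The three self-evaluation figures above can be computed as in the following sketch. It assumes Dice is taken over voxel sets and that the false discovery and false negative rates are counted per lesion; the abstract does not state the exact protocol, and all names and counts here are illustrative.

```python
def dice(pred_voxels, ref_voxels):
    """Dice similarity coefficient between two sets of voxel coordinates."""
    pred, ref = set(pred_voxels), set(ref_voxels)
    if not pred and not ref:
        return 1.0  # convention: two empty segmentations agree perfectly
    return 2 * len(pred & ref) / (len(pred) + len(ref))


def false_discovery_rate(tp, fp):
    """Fraction of detected lesions that are not real lesions."""
    return fp / (tp + fp)


def false_negative_rate(tp, fn):
    """Fraction of real lesions that were missed."""
    return fn / (tp + fn)


# Hypothetical toy data.
overlap = dice({(0, 0, 0), (0, 0, 1)}, {(0, 0, 1), (0, 0, 2)})
fdr = false_discovery_rate(tp=84, fp=16)
fnr = false_negative_rate(tp=85, fn=15)
```

Because the false discovery rate depends on the reference annotations being complete, microbleeds missed during visual rating inflate it, which is the caveat the abstract raises about the reported false positives.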
CV | Jun 8, 2022
Progressive GANomaly: Anomaly detection with progressively growing GANs
Djennifer K. Madzia-Madzou, Hugo J. Kuijf
In medical imaging, obtaining large amounts of labeled data is often a hurdle, because annotations and pathologies are scarce. Anomaly detection is a method that is capable of detecting unseen abnormal data while only being trained on normal (unannotated) data. Several algorithms based on generative adversarial networks (GANs) exist to perform this task, yet they are limited by the instability of GANs. This paper proposes a new method that combines an existing method, GANomaly, with progressively growing GANs. The latter is known to be more stable, considering its ability to generate high-resolution images. The method is tested on Fashion MNIST, the Medical Out-of-Distribution Analysis Challenge (MOOD), and in-house brain MRI, using patches of sizes 16x16 and 32x32. Progressive GANomaly outperforms a one-class SVM and regular GANomaly on Fashion MNIST. Artificial anomalies with varying intensities and diameters are created in MOOD images. Progressive GANomaly detected the most anomalies across these intensities and sizes. Additionally, the intermittent reconstructions from Progressive GANomaly proved to be better. On the in-house brain MRI dataset, regular GANomaly outperformed the other methods.
CV | Dec 29, 2023
Benchmarking the CoW with the TopCoW Challenge: Topology-Aware Anatomical Segmentation of the Circle of Willis for CTA and MRA
Kaiyuan Yang, Fabio Musio, Yihui Ma et al.
The Circle of Willis (CoW) is an important network of arteries connecting major circulations of the brain. Its vascular architecture is believed to affect the risk, severity, and clinical outcome of serious neurovascular diseases. However, characterizing the highly variable CoW anatomy is still a manual and time-consuming expert task. The CoW is usually imaged by two non-invasive angiographic imaging modalities, magnetic resonance angiography (MRA) and computed tomography angiography (CTA), but there exist limited datasets with annotations on CoW anatomy, especially for CTA. Therefore, we organized the TopCoW challenge with the release of an annotated CoW dataset. The TopCoW dataset is the first public dataset with voxel-level annotations for 13 CoW vessel components, enabled by virtual reality technology. It is also the first large dataset using 200 pairs of MRA and CTA from the same patients. As part of the benchmark, we invited submissions worldwide and attracted over 250 registered participants from six continents. The submissions were evaluated on both internal and external test datasets of 226 scans from over five centers. The top performing teams achieved over 90% Dice scores at segmenting the CoW components, over 80% F1 scores at detecting key CoW components, and over 70% balanced accuracy at classifying CoW variants for nearly all test sets. The best algorithms also showed clinical potential in classifying fetal-type posterior cerebral artery and locating aneurysms with CoW anatomy. TopCoW demonstrated the utility and versatility of CoW segmentation algorithms for a wide range of downstream clinical applications with explainability. The annotated datasets and best performing algorithms have been released as public Zenodo records to foster further methodological development and clinical tool building.
CV | Aug 28, 2025
Occlusion Robustness of CLIP for Military Vehicle Classification
Jan Erik van Woerden, Gertjan Burghouts, Lotte Nijskens et al.
Vision-language models (VLMs) like CLIP enable zero-shot classification by aligning images and text in a shared embedding space, offering advantages for defense applications with scarce labeled data. However, CLIP's robustness in challenging military environments, with partial occlusion and degraded signal-to-noise ratio (SNR), remains underexplored. We investigate CLIP variants' robustness to occlusion using a custom dataset of 18 military vehicle classes and evaluate using Normalized Area Under the Curve (NAUC) across occlusion percentages. Four key insights emerge: (1) Transformer-based CLIP models consistently outperform CNNs, (2) fine-grained, dispersed occlusions degrade performance more than larger contiguous occlusions, (3) despite improved accuracy, performance of linear-probed models sharply drops at around 35% occlusion, (4) by finetuning the model's backbone, this performance drop occurs at more than 60% occlusion. These results underscore the importance of occlusion-specific augmentations during training and the need for further exploration into patch-level sensitivity and architectural resilience for real-world deployment of CLIP.
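One plausible reading of the NAUC metric is the trapezoidal area under the accuracy-versus-occlusion curve, divided by the occlusion range so that a model with perfect accuracy at every level scores 1.0. The sketch below implements that reading; the paper's exact normalization may differ, and the function name and example values are hypothetical.

```python
def normalized_auc(occlusion_levels, accuracies):
    """Trapezoidal area under accuracy vs. occlusion, divided by the x-range.

    occlusion_levels: increasing occlusion percentages, e.g. [0, 50, 100]
    accuracies: accuracy in [0, 1] measured at each occlusion level
    """
    if len(occlusion_levels) != len(accuracies) or len(occlusion_levels) < 2:
        raise ValueError("need two or more matching (level, accuracy) points")
    area = 0.0
    for i in range(1, len(occlusion_levels)):
        width = occlusion_levels[i] - occlusion_levels[i - 1]
        area += width * (accuracies[i] + accuracies[i - 1]) / 2.0
    return area / (occlusion_levels[-1] - occlusion_levels[0])


# Hypothetical curve: accuracy falls linearly from 1.0 to 0.0.
nauc_linear = normalized_auc([0, 50, 100], [1.0, 0.5, 0.0])
```

A single summary number like this hides where the curve breaks, which is why the abstract also reports the occlusion level at which performance drops sharply.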
CV | Jun 3, 2025
Deep Learning for Retinal Degeneration Assessment: A Comprehensive Analysis of the MARIO AMD Progression Challenge
Rachid Zeghlache, Ikram Brahim, Pierre-Henri Conze et al.
The MARIO challenge, held at MICCAI 2024, focused on advancing the automated detection and monitoring of age-related macular degeneration (AMD) through the analysis of optical coherence tomography (OCT) images. Designed to evaluate algorithmic performance in detecting neovascular activity changes within AMD, the challenge incorporated unique multi-modal datasets. The primary dataset, sourced from Brest, France, was used by participating teams to train and test their models. The final ranking was determined based on performance on this dataset. An auxiliary dataset from Algeria was used post-challenge to evaluate population and device shifts from submitted solutions. Two tasks were involved in the MARIO challenge. The first one was the classification of evolution between two consecutive 2D OCT B-scans. The second one was the prediction of future AMD evolution over three months for patients undergoing anti-vascular endothelial growth factor (VEGF) therapy. Thirty-five teams participated, with the top 12 finalists presenting their methods. This paper outlines the challenge's structure, tasks, data characteristics, and winning methodologies, setting a benchmark for AMD monitoring using OCT, infrared imaging, and clinical data (such as the number of visits, age, gender, etc.). The results of this challenge indicate that artificial intelligence (AI) performs as well as a physician in measuring AMD progression (Task 1) but is not yet capable of predicting future evolution (Task 2).
IV | Mar 19, 2024
QUBIQ: Uncertainty Quantification for Biomedical Image Segmentation Challenge
Hongwei Bran Li, Fernando Navarro, Ivan Ezhov et al.
Uncertainty in medical image segmentation tasks, especially inter-rater variability, arising from differences in interpretations and annotations by various experts, presents a significant challenge in achieving consistent and reliable image segmentation. This variability not only reflects the inherent complexity and subjective nature of medical image interpretation but also directly impacts the development and evaluation of automated segmentation algorithms. Accurately modeling and quantifying this variability is essential for enhancing the robustness and clinical applicability of these algorithms. We report the set-up and summarize the benchmark results of the Quantification of Uncertainties in Biomedical Image Quantification Challenge (QUBIQ), which was organized in conjunction with the International Conferences on Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2020 and 2021. The challenge focuses on the uncertainty quantification of medical image segmentation, taking into account the omnipresence of inter-rater variability in imaging datasets. The large collection of images with multi-rater annotations features various modalities such as MRI and CT; various organs such as the brain, prostate, kidney, and pancreas; and different image dimensions (2D vs. 3D). A total of 24 teams submitted different solutions to the problem, combining various baseline models, Bayesian neural networks, and ensemble model techniques. The obtained results indicate the importance of ensemble models, as well as the need for further research to develop efficient methods for uncertainty quantification in 3D segmentation tasks.
IV | Aug 5, 2021
MixMicrobleed: Multi-stage detection and segmentation of cerebral microbleeds
Marta Girones Sanguesa, Denis Kutnar, Bas H. M. van der Velden et al.
Cerebral microbleeds are small, dark, round lesions that can be visualised on T2*-weighted MRI or other sequences sensitive to susceptibility effects. In this work, we propose a multi-stage approach to both microbleed detection and segmentation. First, possible microbleed locations are detected with a Mask R-CNN technique. Second, at each possible microbleed location, a simple U-Net performs the final segmentation. This work used the 72 training subjects provided by the "Where is VALDO?" challenge of MICCAI 2021.
IV | Jul 22, 2021
Explainable artificial intelligence (XAI) in deep learning-based medical image analysis
Bas H. M. van der Velden, Hugo J. Kuijf, Kenneth G. A. Gilhuijs et al.
With an increase in deep learning-based methods, the call for explainability of such methods grows, especially in high-stakes decision making areas such as medical image analysis. This survey presents an overview of eXplainable Artificial Intelligence (XAI) used in deep learning-based medical image analysis. A framework of XAI criteria is introduced to classify deep learning-based medical image analysis methods. Papers on XAI techniques in medical image analysis are then surveyed and categorized according to the framework and according to anatomical location. The paper concludes with an outlook of future opportunities for XAI in medical image analysis.
IV | Jan 20, 2021
Variational Autoencoders with a Structural Similarity Loss in Time of Flight MRAs
Kimberley M. Timmins, Irene C. van der Schaaf, Ynte M. Ruigrok et al.
Time-of-Flight Magnetic Resonance Angiographs (TOF-MRAs) enable visualization and analysis of cerebral arteries. This analysis may indicate normal variation of the configuration of the cerebrovascular system or vessel abnormalities, such as aneurysms. A model would be useful to represent normal cerebrovascular structure and variabilities in a healthy population and to differentiate from abnormalities. Current anomaly detection methods using autoencoding convolutional neural networks usually use a voxelwise mean error for optimization. We propose optimizing a variational autoencoder (VAE) with a structural similarity (SSIM) loss for TOF-MRA reconstruction. A patch-trained 2D fully-convolutional VAE was optimized for TOF-MRA reconstruction by comparing vessel segmentations of original and reconstructed MRAs. The method was trained and tested on two datasets: the IXI dataset, and a subset from the ADAM challenge. Both trained networks were tested on a dataset including subjects with aneurysms. We compared VAE optimization with L2-loss and SSIM-loss. Performance was evaluated between original and reconstructed MRAs using mean square error, mean-SSIM, peak-signal-to-noise-ratio and dice similarity index (DSI) of segmented vessels. The L2-optimized VAE outperforms SSIM, with improved reconstruction metrics and DSIs for both datasets. Optimization using SSIM performed best for visual image quality, but with discrepancy in quantitative reconstruction and vascular segmentation. The larger, more diverse IXI dataset had overall better performance. Reconstruction metrics, including SSIM, were lower for MRAs including aneurysms. An SSIM-optimized VAE improved the visual perceptive image quality of TOF-MRA reconstructions. An L2-optimized VAE performed best for TOF-MRA reconstruction, where the vascular segmentation is important. SSIM is a potential metric for anomaly detection of MRAs.
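The two reconstruction objectives compared above differ in what they measure: L2 penalizes per-voxel intensity differences, whereas SSIM compares local means, variances, and covariance. The sketch below computes both for a single flattened patch. It is a simplification under stated assumptions: the paper reports mean-SSIM (an average over local windows), while this version treats the whole patch as one window and uses the conventional constants k1 = 0.01 and k2 = 0.03.

```python
def mse(x, y):
    """Mean squared error (L2 objective) between two flattened patches."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def ssim_single_window(x, y, data_range=1.0, k1=0.01, k2=0.03):
    """SSIM computed over one window covering the whole patch."""
    n = len(x)
    mu_x, mu_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1, c2 = (k1 * data_range) ** 2, (k2 * data_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def ssim_loss(x, y):
    """Loss form: 0 for identical patches, larger for dissimilar ones."""
    return 1.0 - ssim_single_window(x, y)


# Hypothetical 4-voxel patch and its intensity-inverted copy.
patch = [0.1, 0.5, 0.9, 0.3]
inverted = [0.9, 0.5, 0.1, 0.7]
```

Because SSIM rewards matching structure rather than matching intensities, a reconstruction can look good under SSIM while scoring worse on voxelwise error, consistent with the discrepancy the abstract reports.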
IV | Jan 12, 2021
Using uncertainty estimation to reduce false positives in liver lesion detection
Ishaan Bhat, Hugo J. Kuijf, Veronika Cheplygina et al.
Despite the successes of deep learning techniques at detecting objects in medical images, false positive detections occur which may hinder an accurate diagnosis. We propose a technique to reduce false positive detections made by a neural network using an SVM classifier trained with features derived from the uncertainty map of the neural network prediction. We demonstrate the effectiveness of this method for the detection of liver lesions on a dataset of abdominal MR images. We find that a dropout rate of 0.5 produces the fewest false positives in the neural network predictions, and that the trained classifier filters out approximately 90% of these false positive detections in the test set.
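The mention of a dropout rate suggests the uncertainty map comes from repeated stochastic forward passes with dropout active. The sketch below shows one common way to turn such passes into a per-voxel uncertainty value: the entropy of the mean foreground probability. The choice of entropy as the measure, the function names, and the toy numbers are assumptions; the abstract does not specify how the map or the classifier features are computed.

```python
import math


def mean_probability(passes):
    """Average foreground probability per voxel over T stochastic passes."""
    t = len(passes)
    return [sum(p[i] for p in passes) / t for i in range(len(passes[0]))]


def binary_entropy(p, eps=1e-12):
    """Entropy (in nats) of a Bernoulli distribution with parameter p."""
    p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def uncertainty_map(passes):
    """Per-voxel predictive entropy from the mean of the passes."""
    return [binary_entropy(p) for p in mean_probability(passes)]


# Two hypothetical passes over two voxels: the passes disagree on the
# first voxel and agree on the second.
u = uncertainty_map([[1.0, 0.9], [0.0, 0.9]])
```

Summary statistics of such a map inside each detected lesion (for example its mean or maximum) could then serve as features for a classifier that separates true from false positive detections.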
IV | Oct 15, 2019
Liver segmentation and metastases detection in MR images using convolutional neural networks
Mariëlle J. A. Jansen, Hugo J. Kuijf, Maarten Niekel et al.
Primary tumors have a high likelihood of developing metastases in the liver, and early detection of these metastases is crucial for patient outcome. We propose a method based on convolutional neural networks (CNNs) to detect liver metastases. First, the liver is automatically segmented using the six phases of abdominal dynamic contrast enhanced (DCE) MR images. Next, DCE-MR and diffusion weighted (DW) MR images are used for metastases detection within the liver mask. The liver segmentations have a median Dice similarity coefficient of 0.95 compared with manual annotations. The metastases detection method has a sensitivity of 99.8% with a median of 2 false positives per image. The combination of the two MR sequences in a dual pathway network proved valuable for the detection of liver metastases. In conclusion, a high-quality liver segmentation can be obtained, within which liver metastases can be detected successfully.
IV | Aug 22, 2019
Optimal input configuration of dynamic contrast enhanced MRI in convolutional neural networks for liver segmentation
Mariëlle J. A. Jansen, Hugo J. Kuijf, Josien P. W. Pluim
Most MRI liver segmentation methods use a structural 3D scan as input, such as a T1 or T2 weighted scan. Segmentation performance may be improved by utilizing both structural and functional information, as contained in dynamic contrast enhanced (DCE) MR series. Dynamic information can be incorporated in a segmentation method based on convolutional neural networks in a number of ways. In this study, the optimal input configuration of DCE MR images for convolutional neural networks (CNNs) is studied. The performance of three different input configurations for CNNs is studied for a liver segmentation task. The three configurations are I) one phase image of the DCE-MR series as input image; II) the separate phases of the DCE-MR as input images; and III) the separate phases of the DCE-MR as channels of one input image. The three input configurations are fed into a dilated fully convolutional network and into a small U-net. The CNNs were trained using 19 annotated DCE-MR series and tested on another 19 annotated DCE-MR series. The performance of the three input configurations for both networks is evaluated against manual annotations. The results show that both neural networks perform better when the separate phases of the DCE-MR series are used as channels of an input image in comparison to one phase as input image or the separate phases as input images. No significant difference between the performances of the two network architectures was found for the separate phases as channels of an input image.
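The three input configurations differ only in how the DCE phases are arranged into network samples. The sketch below illustrates this with nested lists, where a sample is a list of channels and each channel is a 2D image. The phase count, the image size, and the function names are hypothetical, chosen only to make the shapes visible.

```python
def config_one_phase(phases, index=0):
    """I) a single phase forms one single-channel sample."""
    return [[phases[index]]]


def config_separate_images(phases):
    """II) every phase becomes its own single-channel sample."""
    return [[phase] for phase in phases]


def config_phases_as_channels(phases):
    """III) all phases are stacked as channels of one sample."""
    return [list(phases)]


def describe(samples):
    """Return (channels, height, width) for every sample."""
    return [(len(s), len(s[0]), len(s[0][0])) for s in samples]


# Hypothetical series: 6 phases, each a 3x4 image filled with its phase index.
phases = [[[float(t)] * 4 for _ in range(3)] for t in range(6)]
shapes_i = describe(config_one_phase(phases))
shapes_ii = describe(config_separate_images(phases))
shapes_iii = describe(config_phases_as_channels(phases))
```

Only configuration III lets the first convolution layer see all time points of a voxel at once, which is one plausible reason it performed best in the study.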
CV | Apr 1, 2019
Standardized Assessment of Automatic Segmentation of White Matter Hyperintensities and Results of the WMH Segmentation Challenge
Hugo J. Kuijf, J. Matthijs Biesbroek, Jeroen de Bresser et al.
Quantification of cerebral white matter hyperintensities (WMH) of presumed vascular origin is of key importance in many neurological research studies. Currently, measurements are often still obtained from manual segmentations on brain MR images, which is a laborious procedure. Automatic WMH segmentation methods exist, but a standardized comparison of the performance of such methods is lacking. We organized a scientific challenge, in which developers could evaluate their method on a standardized multi-center/-scanner image dataset, giving an objective comparison: the WMH Segmentation Challenge (https://wmh.isi.uu.nl/). Sixty T1+FLAIR images from three MR scanners were released with manual WMH segmentations for training. A test set of 110 images from five MR scanners was used for evaluation. Segmentation methods had to be containerized and submitted to the challenge organizers. Five evaluation metrics were used to rank the methods: (1) Dice similarity coefficient, (2) modified Hausdorff distance (95th percentile), (3) absolute log-transformed volume difference, (4) sensitivity for detecting individual lesions, and (5) F1-score for individual lesions. Additionally, methods were ranked on their inter-scanner robustness. Twenty participants submitted their method for evaluation. This paper provides a detailed analysis of the results. In brief, there is a cluster of four methods that rank significantly better than the other methods, with one clear winner. The inter-scanner robustness ranking shows that not all methods generalize to unseen scanners. The challenge remains open for future submissions and provides a public platform for method evaluation.
CV | Nov 22, 2018
Response monitoring of breast cancer on DCE-MRI using convolutional neural network-generated seed points and constrained volume growing
Bas H. M. van der Velden, Bob D. de Vos, Claudette E. Loo et al.
Response of breast cancer to neoadjuvant chemotherapy (NAC) can be monitored using the change in visible tumor on magnetic resonance imaging (MRI). In our current workflow, seed points are manually placed in areas of enhancement likely to contain cancer. A constrained volume growing method uses these manually placed seed points as input and generates a tumor segmentation. This method has been rigorously validated using complete pathological embedding. In this study, we propose to exploit deep learning for fast and automatic seed point detection, replacing manual seed point placement in our existing and well-validated workflow. The seed point generator was developed in early breast cancer patients with pathology-proven segmentations (N=100) who underwent surgery shortly after MRI. It consisted of an ensemble of three independently trained fully convolutional dilated neural networks that classified breast voxels as tumor or non-tumor. Subsequently, local maxima were used as seed points for volume growing in patients receiving NAC (N=10). The percentage of tumor volume change was evaluated against semi-automatic segmentations. The primary cancer was localized in 95% of the tumors at the cost of 0.9 false positives per patient. False positives included focally enhancing regions of unknown origin and parts of the intramammary blood vessels. Volume growing from the seed points showed a median tumor volume decrease of 70% (interquartile range: 50%-77%), comparable to the semi-automatic segmentations (median: 70%, interquartile range: 23%-76%). To conclude, a fast and automatic seed point generator was developed, fully automating a well-validated semi-automatic workflow for response monitoring of breast cancer to neoadjuvant chemotherapy.
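The step from a voxelwise tumor probability map to seed points, and the volume-change measure used for response monitoring, can be sketched as follows. This is a 2D simplification of what is a 3D problem in the paper; the probability threshold, the 8-neighbour definition of a local maximum, and the toy map are all assumptions for illustration.

```python
def local_maxima_2d(prob, threshold=0.5):
    """Return (row, col) positions that exceed all 8 neighbours.

    prob: 2D list of tumor probabilities; threshold is an assumed cut-off
    that keeps low-probability peaks from becoming seed points.
    """
    rows, cols = len(prob), len(prob[0])
    seeds = []
    for r in range(rows):
        for c in range(cols):
            value = prob[r][c]
            if value < threshold:
                continue
            neighbours = [
                prob[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(value > n for n in neighbours):
                seeds.append((r, c))
    return seeds


def percent_volume_change(volume_before, volume_after):
    """Signed percentage change; negative values mean the tumor shrank."""
    return 100.0 * (volume_after - volume_before) / volume_before


# Hypothetical probability map with two peaks above the threshold.
prob_map = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.2, 0.1],
    [0.1, 0.2, 0.1, 0.6],
    [0.0, 0.1, 0.3, 0.2],
]
seeds = local_maxima_2d(prob_map)
change = percent_volume_change(10.0, 3.0)
```

Each seed would then be handed to the constrained volume growing method, so a false positive seed costs one extra grown region rather than a wrong voxelwise segmentation.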