Hugo J. Kuijf

h-index43

21papers

1,716citations

Novelty27%

AI Score36

Ranked #100,548 of 194,257 authors (top 52%)#33,775 in CV (top 57%)

21 Papers

7.6CVApr 26, 2023Code

Effect of latent space distribution on the segmentation of images with multiple annotations

Ishaan Bhat, Josien P. W. Pluim, Max A. Viergever et al.

We propose the Generalized Probabilistic U-Net, which extends the Probabilistic U-Net by allowing more general forms of the Gaussian distribution as the latent space distribution that can better approximate the uncertainty in the reference segmentations. We study the effect the choice of latent space distribution has on capturing the variation in the reference segmentations for lung tumors and white matter hyperintensities in the brain. We show that the choice of distribution affects the sample diversity of the predictions and their overlap with respect to the reference segmentations. We have made our implementation available at https://github.com/ishaanb92/GeneralizedProbabilisticUNet

8.1IVJun 22, 2022Code

Influence of uncertainty estimation techniques on false-positive reduction in liver lesion detection

Ishaan Bhat, Josien P. W. Pluim, Max A. Viergever et al.

Deep learning techniques show success in detecting objects in medical images, but still suffer from false-positive predictions that may hinder accurate diagnosis. The estimated uncertainty of the neural network output has been used to flag incorrect predictions. We study the role played by features computed from neural network uncertainty estimates and shape-based features computed from binary predictions in reducing false positives in liver lesion detection by developing a classification-based post-processing step for different uncertainty estimation methods. We demonstrate an improvement in the lesion detection performance of the neural network (with respect to F1-score) for all uncertainty estimation methods on two datasets, comprising abdominal MR and CT images, respectively. We show that features computed from neural network uncertainty estimates tend not to contribute much toward reducing false positives. Our results show that factors like class imbalance (true over false positive ratio) and shape-based features extracted from uncertainty maps play an important role in distinguishing false positive from true positive predictions. Our code can be found at https://github.com/ishaanb92/FPCPipeline.

8.1CVJul 26, 2022Code

Generalized Probabilistic U-Net for medical image segementation

Ishaan Bhat, Josien P. W. Pluim, Hugo J. Kuijf

We propose the Generalized Probabilistic U-Net, which extends the Probabilistic U-Net by allowing more general forms of the Gaussian distribution as the latent space distribution that can better approximate the uncertainty in the reference segmentations. We study the effect the choice of latent space distribution has on capturing the uncertainty in the reference segmentations using the LIDC-IDRI dataset. We show that the choice of distribution affects the sample diversity of the predictions and their overlap with respect to the reference segmentations. For the LIDC-IDRI dataset, we show that using a mixture of Gaussians results in a statistically significant improvement in the generalized energy distance (GED) metric with respect to the standard Probabilistic U-Net. We have made our implementation available at https://github.com/ishaanb92/GeneralizedProbabilisticUNet

16.4CVMar 30, 2023

Why is the winner the best?

Matthias Eisenmann, Annika Reinke, Vivienn Weru et al.

International benchmarking competitions have become fundamental for the comparative performance assessment of image analysis methods. However, little attention has been given to investigating what can be learnt from these competitions. Do they really generate scientific progress? What are common and successful participation strategies? What makes a solution superior to a competing method? To address this gap in the literature, we performed a multi-center study with all 80 competitions that were conducted in the scope of IEEE ISBI 2021 and MICCAI 2021. Statistical analyses performed based on comprehensive descriptions of the submitted algorithms linked to their rank as well as the underlying participation strategies revealed common characteristics of winning solutions. These typically include the use of multi-task learning (63%) and/or multi-stage pipelines (61%), and a focus on augmentation (100%), image preprocessing (97%), data curation (79%), and postprocessing (66%). The "typical" lead of a winning team is a computer scientist with a doctoral degree, five years of experience in biomedical image analysis, and four years of experience in deep learning. Two core general development strategies stood out for highly-ranked teams: the reflection of the metrics in the method design and the focus on analyzing and handling failure cases. According to the organizers, 43% of the winning algorithms exceeded the state of the art but only 11% completely solved the respective domain problem. The insights of our study could help researchers (1) improve algorithm development strategies when approaching new problems, and (2) focus on open research questions revealed by this work.

2.7IVJul 27, 2022

Future Unruptured Intracranial Aneurysm Growth Prediction using Mesh Convolutional Neural Networks

Kimberley M. Timmins, Maarten J. Kamphuis, Iris N. Vos et al.

The growth of unruptured intracranial aneurysms (UIAs) is a predictor of rupture. Therefore, for further imaging surveillance and treatment planning, it is important to be able to predict if an UIA is likely to grow based on an initial baseline Time-of-Flight MRA (TOF-MRA). It is known that the size and shape of UIAs are predictors of aneurysm growth and/or rupture. We perform a feasibility study of using a mesh convolutional neural network for future UIA growth prediction from baseline TOF-MRAs. We include 151 TOF-MRAs, with 169 UIAs where 49 UIAs were classified as growing and 120 as stable, based on the clinical definition of growth (>1 mm increase in size in follow-up scan). UIAs were segmented from TOF-MRAs and meshes were automatically generated. We investigate the input of both UIA mesh only and region-of-interest (ROI) meshes including UIA and surrounding parent vessels. We develop a classification model to predict UIAs that will grow or remain stable. The model consisted of a mesh convolutional neural network including additional novel input edge features of shape index and curvedness which describe the surface topology. It was investigated if input edge mid-point co-ordinates influenced the model performance. The model with highest AUC (63.8%) for growth prediction was using UIA meshes with input edge mid-point co-ordinate features (average F1 score = 62.3%, accuracy = 66.9%, sensitivity = 57.3%, specificity = 70.8%). We present a future UIA growth prediction model based on a mesh convolutional neural network with promising results.

2.4IVAug 5, 2021Code

MixLacune: Segmentation of lacunes of presumed vascular origin

Denis Kutnar, Bas H. M. van der Velden, Marta Girones Sanguesa et al.

Lacunes of presumed vascular origin are fluid-filled cavities of between 3 - 15 mm in diameter, visible on T1 and FLAIR brain MRI. Quantification of lacunes relies on manual annotation or semi-automatic / interactive approaches; and almost no automatic methods exist for this task. In this work, we present a two-stage approach to segment lacunes of presumed vascular origin: (1) detection with Mask R-CNN followed by (2) segmentation with a U-Net CNN. Data originates from Task 3 of the "Where is VALDO?" challenge and consists of 40 training subjects. We report the mean DICE on the training set of 0.83 and on the validation set of 0.84. Source code is available at: https://github.com/hjkuijf/MixLacune . The docker container hjkuijf/mixlacune can be pulled from https://hub.docker.com/r/hjkuijf/mixlacune .

4.4IVAug 3, 2021Code

MixMicrobleedNet: segmentation of cerebral microbleeds using nnU-Net

Hugo J. Kuijf

Cerebral microbleeds are small hypointense lesions visible on magnetic resonance imaging (MRI) with gradient echo, T2*, or susceptibility weighted (SWI) imaging. Assessment of cerebral microbleeds is mostly performed by visual inspection. The past decade has seen the rise of semi-automatic tools to assist with rating and more recently fully automatic tools for microbleed detection. In this work, we explore the use of nnU-Net as a fully automated tool for microbleed segmentation. Data was provided by the ``Where is VALDO?'' challenge of MICCAI 2021. The final method consists of nnU-Net in the ``3D full resolution U-Net'' configuration trained on all data (fold = `all'). No post-processing options of nnU-Net were used. Self-evaluation on the training data showed an estimated Dice of 0.80, false discovery rate of 0.16, and false negative rate of 0.15. Final evaluation on the test set of the VALDO challenge is pending. Visual inspection of the results showed that most of the reported false positives could be an actual microbleed that might have been missed during visual rating. Source code is available at: https://github.com/hjkuijf/MixMicrobleedNet . The docker container hjkuijf/mixmicrobleednet can be pulled from https://hub.docker.com/r/hjkuijf/mixmicrobleednet .

2.6CVJun 8, 2022

Progressive GANomaly: Anomaly detection with progressively growing GANs

Djennifer K. Madzia-Madzou, Hugo J. Kuijf

In medical imaging, obtaining large amounts of labeled data is often a hurdle, because annotations and pathologies are scarce. Anomaly detection is a method that is capable of detecting unseen abnormal data while only being trained on normal (unannotated) data. Several algorithms based on generative adversarial networks (GANs) exist to perform this task, yet certain limitations are in place because of the instability of GANs. This paper proposes a new method by combining an existing method, GANomaly, with progressively growing GANs. The latter is known to be more stable, considering its ability to generate high-resolution images. The method is tested using Fashion MNIST, Medical Out-of-Distribution Analysis Challenge (MOOD), and in-house brain MRI; using patches of sizes 16x16 and 32x32. Progressive GANomaly outperforms a one-class SVM or regular GANomaly on Fashion MNIST. Artificial anomalies are created in MOOD images with varying intensities and diameters. Progressive GANomaly detected the most anomalies with varying intensity and size. Additionally, the intermittent reconstructions are proven to be better from progressive GANomaly. On the in-house brain MRI dataset, regular GANomaly outperformed the other methods.

23.0CVDec 29, 2023Code

Benchmarking the CoW with the TopCoW Challenge: Topology-Aware Anatomical Segmentation of the Circle of Willis for CTA and MRA

Kaiyuan Yang, Fabio Musio, Yihui Ma et al.

The Circle of Willis (CoW) is an important network of arteries connecting major circulations of the brain. Its vascular architecture is believed to affect the risk, severity, and clinical outcome of serious neurovascular diseases. However, characterizing the highly variable CoW anatomy is still a manual and time-consuming expert task. The CoW is usually imaged by two non-invasive angiographic imaging modalities, magnetic resonance angiography (MRA) and computed tomography angiography (CTA), but there exist limited datasets with annotations on CoW anatomy, especially for CTA. Therefore, we organized the TopCoW challenge with the release of an annotated CoW dataset. The TopCoW dataset is the first public dataset with voxel-level annotations for 13 CoW vessel components, enabled by virtual reality technology. It is also the first large dataset using 200 pairs of MRA and CTA from the same patients. As part of the benchmark, we invited submissions worldwide and attracted over 250 registered participants from six continents. The submissions were evaluated on both internal and external test datasets of 226 scans from over five centers. The top performing teams achieved over 90% Dice scores at segmenting the CoW components, over 80% F1 scores at detecting key CoW components, and over 70% balanced accuracy at classifying CoW variants for nearly all test sets. The best algorithms also showed clinical potential in classifying fetal-type posterior cerebral artery and locating aneurysms with CoW anatomy. TopCoW demonstrated the utility and versatility of CoW segmentation algorithms for a wide range of downstream clinical applications with explainability. The annotated datasets and best performing algorithms have been released as public Zenodo records to foster further methodological development and clinical tool building.

3.6CVAug 28, 2025

Occlusion Robustness of CLIP for Military Vehicle Classification

Jan Erik van Woerden, Gertjan Burghouts, Lotte Nijskens et al.

Vision-language models (VLMs) like CLIP enable zero-shot classification by aligning images and text in a shared embedding space, offering advantages for defense applications with scarce labeled data. However, CLIP's robustness in challenging military environments, with partial occlusion and degraded signal-to-noise ratio (SNR), remains underexplored. We investigate CLIP variants' robustness to occlusion using a custom dataset of 18 military vehicle classes and evaluate using Normalized Area Under the Curve (NAUC) across occlusion percentages. Four key insights emerge: (1) Transformer-based CLIP models consistently outperform CNNs, (2) fine-grained, dispersed occlusions degrade performance more than larger contiguous occlusions, (3) despite improved accuracy, performance of linear-probed models sharply drops at around 35% occlusion, (4) by finetuning the model's backbone, this performance drop occurs at more than 60% occlusion. These results underscore the importance of occlusion-specific augmentations during training and the need for further exploration into patch-level sensitivity and architectural resilience for real-world deployment of CLIP.

3.6CVJun 3, 2025

Deep Learning for Retinal Degeneration Assessment: A Comprehensive Analysis of the MARIO AMD Progression Challenge

Rachid Zeghlache, Ikram Brahim, Pierre-Henri Conze et al.

The MARIO challenge, held at MICCAI 2024, focused on advancing the automated detection and monitoring of age-related macular degeneration (AMD) through the analysis of optical coherence tomography (OCT) images. Designed to evaluate algorithmic performance in detecting neovascular activity changes within AMD, the challenge incorporated unique multi-modal datasets. The primary dataset, sourced from Brest, France, was used by participating teams to train and test their models. The final ranking was determined based on performance on this dataset. An auxiliary dataset from Algeria was used post-challenge to evaluate population and device shifts from submitted solutions. Two tasks were involved in the MARIO challenge. The first one was the classification of evolution between two consecutive 2D OCT B-scans. The second one was the prediction of future AMD evolution over three months for patients undergoing anti-vascular endothelial growth factor (VEGF) therapy. Thirty-five teams participated, with the top 12 finalists presenting their methods. This paper outlines the challenge's structure, tasks, data characteristics, and winning methodologies, setting a benchmark for AMD monitoring using OCT, infrared imaging, and clinical data (such as the number of visits, age, gender, etc.). The results of this challenge indicate that artificial intelligence (AI) performs as well as a physician in measuring AMD progression (Task 1) but is not yet able of predicting future evolution (Task 2).

3.6CVMay 1, 2025

Automated segmenta-on of pediatric neuroblastoma on multi-modal MRI: Results of the SPPIN challenge at MICCAI 2023

M. A. D. Buser, D. C. Simons, M. Fitski et al.

Surgery plays an important role within the treatment for neuroblastoma, a common pediatric cancer. This requires careful planning, often via magnetic resonance imaging (MRI)-based anatomical 3D models. However, creating these models is often time-consuming and user dependent. We organized the Surgical Planning in Pediatric Neuroblastoma (SPPIN) challenge, to stimulate developments on this topic, and set a benchmark for fully automatic segmentation of neuroblastoma on multi-model MRI. The challenge started with a training phase, where teams received 78 sets of MRI scans from 34 patients, consisting of both diagnostic and post-chemotherapy MRI scans. The final test phase, consisting of 18 MRI sets from 9 patients, determined the ranking of the teams. Ranking was based on the Dice similarity coefficient (Dice score), the 95th percentile of the Hausdorff distance (HD95) and the volumetric similarity (VS). The SPPIN challenge was hosted at MICCAI 2023. The final leaderboard consisted of 9 teams. The highest-ranking team achieved a median Dice score 0.82, a median HD95 of 7.69 mm and a VS of 0.91, utilizing a large, pretrained network called STU-Net. A significant difference for the segmentation results between diagnostic and post-chemotherapy MRI scans was observed (Dice = 0.89 vs Dice = 0.59, P = 0.01) for the highest-ranking team. SPPIN is the first medical segmentation challenge in extracranial pediatric oncology. The highest-ranking team used a large pre-trained network, suggesting that pretraining can be of use in small, heterogenous datasets. Although the results of the highest-ranking team were high for most patients, segmentation especially in small, pre-treated tumors were insufficient. Therefore, more reliable segmentation methods are needed to create clinically applicable models to aid surgical planning in pediatric neuroblastoma.

20.6IVMar 19, 2024

QUBIQ: Uncertainty Quantification for Biomedical Image Segmentation Challenge

Hongwei Bran Li, Fernando Navarro, Ivan Ezhov et al.

Uncertainty in medical image segmentation tasks, especially inter-rater variability, arising from differences in interpretations and annotations by various experts, presents a significant challenge in achieving consistent and reliable image segmentation. This variability not only reflects the inherent complexity and subjective nature of medical image interpretation but also directly impacts the development and evaluation of automated segmentation algorithms. Accurately modeling and quantifying this variability is essential for enhancing the robustness and clinical applicability of these algorithms. We report the set-up and summarize the benchmark results of the Quantification of Uncertainties in Biomedical Image Quantification Challenge (QUBIQ), which was organized in conjunction with International Conferences on Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2020 and 2021. The challenge focuses on the uncertainty quantification of medical image segmentation which considers the omnipresence of inter-rater variability in imaging datasets. The large collection of images with multi-rater annotations features various modalities such as MRI and CT; various organs such as the brain, prostate, kidney, and pancreas; and different image dimensions 2D-vs-3D. A total of 24 teams submitted different solutions to the problem, combining various baseline models, Bayesian neural networks, and ensemble model techniques. The obtained results indicate the importance of the ensemble models, as well as the need for further research to develop efficient 3D methods for uncertainty quantification methods in 3D segmentation tasks.

4.4IVAug 5, 2021

MixMicrobleed: Multi-stage detection and segmentation of cerebral microbleeds

Marta Girones Sanguesa, Denis Kutnar, Bas H. M. van der Velden et al.

Cerebral microbleeds are small, dark, round lesions that can be visualised on T2*-weighted MRI or other sequences sensitive to susceptibility effects. In this work, we propose a multi-stage approach to both microbleed detection and segmentation. First, possible microbleed locations are detected with a Mask R-CNN technique. Second, at each possible microbleed location, a simple U-Net performs the final segmentation. This work used the 72 subjects as training data provided by the "Where is VALDO?" challenge of MICCAI 2021.

37.5IVJul 22, 2021

Explainable artificial intelligence (XAI) in deep learning-based medical image analysis

Bas H. M. van der Velden, Hugo J. Kuijf, Kenneth G. A. Gilhuijs et al.

With an increase in deep learning-based methods, the call for explainability of such methods grows, especially in high-stakes decision making areas such as medical image analysis. This survey presents an overview of eXplainable Artificial Intelligence (XAI) used in deep learning-based medical image analysis. A framework of XAI criteria is introduced to classify deep learning-based medical image analysis methods. Papers on XAI techniques in medical image analysis are then surveyed and categorized according to the framework and according to anatomical location. The paper concludes with an outlook of future opportunities for XAI in medical image analysis.

8.8IVJan 22, 2021

Automatic Cerebral Vessel Extraction in TOF-MRA Using Deep Learning

V. de Vos, K. M. Timmins, I. C. van der Schaaf et al.

Deep learning approaches may help radiologists in the early diagnosis and timely treatment of cerebrovascular diseases. Accurate cerebral vessel segmentation of Time-of-Flight Magnetic Resonance Angiographs (TOF-MRAs) is an essential step in this process. This study investigates deep learning approaches for automatic, fast and accurate cerebrovascular segmentation for TOF-MRAs. The performance of several data augmentation and selection methods for training a 2D and 3D U-Net for vessel segmentation was investigated in five experiments: a) without augmentation, b) Gaussian blur, c) rotation and flipping, d) Gaussian blur, rotation and flipping and e) different input patch sizes. All experiments were performed by patch-training both a 2D and 3D U-Net and predicted on a test set of MRAs. Ground truth was manually defined using an interactive threshold and region growing method. The performance was evaluated using the Dice Similarity Coefficient (DSC), Modified Hausdorff Distance and Volumetric Similarity, between the predicted images and the interactively defined ground truth. The segmentation performance of all trained networks on the test set was found to be good, with DSC scores ranging from 0.72 to 0.83. Both the 2D and 3D U-Net had the best segmentation performance with Gaussian blur, rotation and flipping compared to other experiments without augmentation or only one of those augmentation techniques. Additionally, training on larger patches or slices gave optimal segmentation results. In conclusion, vessel segmentation can be optimally performed on TOF-MRAs using a trained 3D U-Net on larger patches, where data augmentation including Gaussian blur, rotation and flipping was performed on the training data.

2.4IVJan 20, 2021

Variational Autoencoders with a Structural Similarity Loss in Time of Flight MRAs

Kimberley M. Timmins, Irene C. van der Schaaf, Ynte M. Ruigrok et al.

Time-of-Flight Magnetic Resonance Angiographs (TOF-MRAs) enable visualization and analysis of cerebral arteries. This analysis may indicate normal variation of the configuration of the cerebrovascular system or vessel abnormalities, such as aneurysms. A model would be useful to represent normal cerebrovascular structure and variabilities in a healthy population and to differentiate from abnormalities. Current anomaly detection using autoencoding convolutional neural networks usually use a voxelwise mean-error for optimization. We propose optimizing a variational-autoencoder (VAE) with structural similarity loss (SSIM) for TOF-MRA reconstruction. A patch-trained 2D fully-convolutional VAE was optimized for TOF-MRA reconstruction by comparing vessel segmentations of original and reconstructed MRAs. The method was trained and tested on two datasets: the IXI dataset, and a subset from the ADAM challenge. Both trained networks were tested on a dataset including subjects with aneurysms. We compared VAE optimization with L2-loss and SSIM-loss. Performance was evaluated between original and reconstructed MRAs using mean square error, mean-SSIM, peak-signal-to-noise-ratio and dice similarity index (DSI) of segmented vessels. The L2-optimized VAE outperforms SSIM, with improved reconstruction metrics and DSIs for both datasets. Optimization using SSIM performed best for visual image quality, but with discrepancy in quantitative reconstruction and vascular segmentation. The larger, more diverse IXI dataset had overall better performance. Reconstruction metrics, including SSIM, were lower for MRAs including aneurysms. A SSIM-optimized VAE improved the visual perceptive image quality of TOF-MRA reconstructions. A L2-optimized VAE performed best for TOF-MRA reconstruction, where the vascular segmentation is important. SSIM is a potential metric for anomaly detection of MRAs.

7.5IVJan 12, 2021

Using uncertainty estimation to reduce false positives in liver lesion detection

Ishaan Bhat, Hugo J. Kuijf, Veronika Cheplygina et al.

Despite the successes of deep learning techniques at detecting objects in medical images, false positive detections occur which may hinder an accurate diagnosis. We propose a technique to reduce false positive detections made by a neural network using an SVM classifier trained with features derived from the uncertainty map of the neural network prediction. We demonstrate the effectiveness of this method for the detection of liver lesions on a dataset of abdominal MR images. We find that the use of a dropout rate of 0.5 produces the least number of false positives in the neural network predictions and the trained classifier filters out approximately 90% of these false positives detections in the test-set.

24.5CVApr 1, 2019Code

Standardized Assessment of Automatic Segmentation of White Matter Hyperintensities and Results of the WMH Segmentation Challenge

Hugo J. Kuijf, J. Matthijs Biesbroek, Jeroen de Bresser et al.

Quantification of cerebral white matter hyperintensities (WMH) of presumed vascular origin is of key importance in many neurological research studies. Currently, measurements are often still obtained from manual segmentations on brain MR images, which is a laborious procedure. Automatic WMH segmentation methods exist, but a standardized comparison of the performance of such methods is lacking. We organized a scientific challenge, in which developers could evaluate their method on a standardized multi-center/-scanner image dataset, giving an objective comparison: the WMH Segmentation Challenge (https://wmh.isi.uu.nl/). Sixty T1+FLAIR images from three MR scanners were released with manual WMH segmentations for training. A test set of 110 images from five MR scanners was used for evaluation. Segmentation methods had to be containerized and submitted to the challenge organizers. Five evaluation metrics were used to rank the methods: (1) Dice similarity coefficient, (2) modified Hausdorff distance (95th percentile), (3) absolute log-transformed volume difference, (4) sensitivity for detecting individual lesions, and (5) F1-score for individual lesions. Additionally, methods were ranked on their inter-scanner robustness. Twenty participants submitted their method for evaluation. This paper provides a detailed analysis of the results. In brief, there is a cluster of four methods that rank significantly better than the other methods, with one clear winner. The inter-scanner robustness ranking shows that not all methods generalize to unseen scanners. The challenge remains open for future submissions and provides a public platform for method evaluation.

13.3CVDec 30, 2018

Cascaded V-Net using ROI masks for brain tumor segmentation

Adrià Casamitjana, Marcel Catà, Irina Sánchez et al.

In this work we approach the brain tumor segmentation problem with a cascade of two CNNs inspired in the V-Net architecture \cite{VNet}, reformulating residual connections and making use of ROI masks to constrain the networks to train only on relevant voxels. This architecture allows dense training on problems with highly skewed class distributions, such as brain tumor segmentation, by focusing training only on the vecinity of the tumor area. We report results on BraTS2017 Training and Validation sets.

0.9CVNov 22, 2018

Response monitoring of breast cancer on DCE-MRI using convolutional neural network-generated seed points and constrained volume growing

Bas H. M. van der Velden, Bob D. de Vos, Claudette E. Loo et al.

Response of breast cancer to neoadjuvant chemotherapy (NAC) can be monitored using the change in visible tumor on magnetic resonance imaging (MRI). In our current workflow, seed points are manually placed in areas of enhancement likely to contain cancer. A constrained volume growing method uses these manually placed seed points as input and generates a tumor segmentation. This method is rigorously validated using complete pathological embedding. In this study, we propose to exploit deep learning for fast and automatic seed point detection, replacing manual seed point placement in our existing and well-validated workflow. The seed point generator was developed in early breast cancer patients with pathology-proven segmentations (N=100), operated shortly after MRI. It consisted of an ensemble of three independently trained fully convolutional dilated neural networks that classified breast voxels as tumor or non-tumor. Subsequently, local maxima were used as seed points for volume growing in patients receiving NAC (N=10). The percentage of tumor volume change was evaluated against semi-automatic segmentations. The primary cancer was localized in 95% of the tumors at the cost of 0.9 false positive per patient. False positives included focally enhancing regions of unknown origin and parts of the intramammary blood vessels. Volume growing from the seed points showed a median tumor volume decrease of 70% (interquartile range: 50%-77%), comparable to the semi-automatic segmentations (median: 70%, interquartile range 23%-76%). To conclude, a fast and automatic seed point generator was developed, fully automating a well-validated semi-automatic workflow for response monitoring of breast cancer to neoadjuvant chemotherapy.