Ivana Išgum

CV
h-index82
40papers
3,358citations
Novelty46%
AI Score56

40 Papers

IVMay 24, 2022
Generative Models for Reproducible Coronary Calcium Scoring

Sanne G. M. van Velzen, Bob D. de Vos, Julia M. H. Noothout et al.

Purpose: Coronary artery calcium (CAC) score, i.e. the amount of CAC quantified in CT, is a strong and independent predictor of coronary heart disease (CHD) events. However, CAC scoring suffers from limited interscan reproducibility, which is mainly due to the clinical definition requiring application of a fixed intensity level threshold for segmentation of calcifications. This limitation is especially pronounced in non-ECG-synchronized CT where lesions are more impacted by cardiac motion and partial volume effects. Therefore, we propose a CAC quantification method that does not require a threshold for segmentation of CAC. Approach: Our method utilizes a generative adversarial network where a CT with CAC is decomposed into an image without CAC and an image showing only CAC. The method, using a CycleGAN, was trained using 626 low-dose chest CTs and 514 radiotherapy treatment planning CTs. Interscan reproducibility was compared to clinical calcium scoring in radiotherapy treatment planning CTs of 1,662 patients, each having two scans. Results: A lower relative interscan difference in CAC mass was achieved by the proposed method: 47% compared to 89% manual clinical calcium scoring. The intraclass correlation coefficient of Agatston scores was 0.96 for the proposed method compared to 0.91 for automatic clinical calcium scoring. Conclusions: The increased interscan reproducibility achieved by our method may lead to increased reliability of CHD risk categorization and improved accuracy of CHD event prediction.

IVMay 19Code
Disentangling Sampling from Training Budget in Class-Imbalanced CT Body Composition Segmentation

Iason Skylitsis, Dimitrios Karkalousos, Ivana Išgum

Class imbalance is a fundamental challenge in medical image segmentation, where frequent classes typically dominate training at the expense of rare classes. Loss-based approaches mitigate imbalance by reweighting the per-pixel loss within the batch, while sampling strategies control which images enter the batch. Yet neither explicitly controls which classes appear within the batch, leaving rare-class exposure only partially rebalanced. In this work, we adopt episodic sampling from few-shot learning to promote class-balanced batch construction in a fully supervised setting. We decouple episodic sampling from its conventional metric-learning context and evaluate it in body composition segmentation in CT. We compare episodic sampling against random and weighted sampling on nine muscle and adipose tissues, derived from 210 scans of the public SAROS dataset. Training is performed under full- and low-data regimes, with additional comparisons under matched training iteration budgets. Under full-data training, all three strategies performed comparably (mean Dice 0.882 for episodic, 0.878 for random and weighted). Under low-data training, episodic sampling outperformed random and weighted (0.787 vs. 0.758 and 0.762), driven by a 12-fold difference in training iterations. Under matched training budgets, random and weighted overfit earlier, while episodic improved for approximately three times more iterations before plateauing. Our findings identify the training iteration budget as under-recognized confound in sampling strategies, motivating iteration-aware evaluation protocols for small datasets. Furthermore, the residual advantage of episodic sampling is consistent with an implicit regularization effect of class-balanced batches, offering a low-cost, model-agnostic strategy for class-imbalanced medical image segmentation. Code is available at https://github.com/iasonsky/episodic-sampling.

IVNov 24, 2023
Deep Learning for Automatic Strain Quantification in Arrhythmogenic Right Ventricular Cardiomyopathy

Laura Alvarez-Florez, Jörg Sander, Mimount Bourfiss et al.

Quantification of cardiac motion with cine Cardiac Magnetic Resonance Imaging (CMRI) is an integral part of arrhythmogenic right ventricular cardiomyopathy (ARVC) diagnosis. Yet, the expert evaluation of motion abnormalities with CMRI is a challenging task. To automatically assess cardiac motion, we register CMRIs from different time points of the cardiac cycle using Implicit Neural Representations (INRs) and perform a biomechanically informed regularization inspired by the myocardial incompressibility assumption. To enhance the registration performance, our method first rectifies the inter-slice misalignment inherent to CMRI by performing a rigid registration guided by the long-axis views, and then increases the through-plane resolution using an unsupervised deep learning super-resolution approach. Finally, we propose to synergically combine information from short-axis and 4-chamber long-axis views, along with an initialization to incorporate information from multiple cardiac time points. Thereafter, to quantify cardiac motion, we calculate global and segmental strain over a cardiac cycle and compute the peak strain. The evaluation of the method is performed on a dataset of cine CMRI scans from 47 ARVC patients and 67 controls. Our results show that inter-slice alignment and generation of super-resolved volumes combined with joint analysis of the two cardiac views, notably improves registration performance. Furthermore, the proposed initialization yields more physiologically plausible registrations. The significant differences in the peak strain, discerned between the ARVC patients and healthy controls suggest that automated motion quantification methods may assist in diagnosis and provide further understanding of disease-specific alterations of cardiac motion.

IVAug 9, 2023
Deep Learning-Based Prediction of Fractional Flow Reserve along the Coronary Artery

Nils Hampe, Sanne G. M. van Velzen, Jean-Paul Aben et al.

Functionally significant coronary artery disease (CAD) is caused by plaque buildup in the coronary arteries, potentially leading to narrowing of the arterial lumen, i.e. coronary stenosis, that significantly obstructs blood flow to the myocardium. The current reference for establishing the presence of a functionally significant stenosis is invasive fractional flow reserve (FFR) measurement. To avoid invasive measurements, non-invasive prediction of FFR from coronary CT angiography (CCTA) has emerged. For this, machine learning approaches, characterized by fast inference, are increasingly developed. However, these methods predict a single FFR value per artery i.e. they don't provide information about the stenosis location or treatment strategy. We propose a deep learning-based method to predict the FFR along the artery from CCTA scans. This study includes CCTA images of 110 patients who underwent invasive FFR pullback measurement in 112 arteries. First, a multi planar reconstruction (MPR) of the artery is fed to a variational autoencoder to characterize the artery, i.e. through the lumen area and unsupervised artery encodings. Thereafter, a convolutional neural network (CNN) predicts the FFR along the artery. The CNN is supervised by multiple loss functions, notably a loss function inspired by the Earth Mover's Distance (EMD) to predict the correct location of FFR drops and a histogram-based loss to explicitly supervise the slope of the FFR curve. To train and evaluate our model, eight-fold cross-validation was performed. The resulting FFR curves show good agreement with the reference allowing the distinction between diffuse and focal CAD distributions in most cases. Quantitative evaluation yielded a mean absolute difference in the area under the FFR pullback curve (AUPC) of 1.7. The method may pave the way towards fast, accurate, automatic prediction of FFR along the artery from CCTA.

IVNov 6, 2023
Predicting Age from White Matter Diffusivity with Residual Learning

Chenyu Gao, Michael E. Kim, Ho Hin Lee et al.

Imaging findings inconsistent with those expected at specific chronological age ranges may serve as early indicators of neurological disorders and increased mortality risk. Estimation of chronological age, and deviations from expected results, from structural MRI data has become an important task for developing biomarkers that are sensitive to such deviations. Complementary to structural analysis, diffusion tensor imaging (DTI) has proven effective in identifying age-related microstructural changes within the brain white matter, thereby presenting itself as a promising additional modality for brain age prediction. Although early studies have sought to harness DTI's advantages for age estimation, there is no evidence that the success of this prediction is owed to the unique microstructural and diffusivity features that DTI provides, rather than the macrostructural features that are also available in DTI data. Therefore, we seek to develop white-matter-specific age estimation to capture deviations from normal white matter aging. Specifically, we deliberately disregard the macrostructural information when predicting age from DTI scalar images, using two distinct methods. The first method relies on extracting only microstructural features from regions of interest. The second applies 3D residual neural networks (ResNets) to learn features directly from the images, which are non-linearly registered and warped to a template to minimize macrostructural variations. When tested on unseen data, the first method yields mean absolute error (MAE) of 6.11 years for cognitively normal participants and MAE of 6.62 years for cognitively impaired participants, while the second method achieves MAE of 4.69 years for cognitively normal participants and MAE of 4.96 years for cognitively impaired participants. We find that the ResNet model captures subtler, non-macrostructural features for brain age prediction.

IVOct 3, 2023
Robust deformable image registration using cycle-consistent implicit representations

Louis D. van Harten, Jaap Stoker, Ivana Išgum

Recent works in medical image registration have proposed the use of Implicit Neural Representations, demonstrating performance that rivals state-of-the-art learning-based methods. However, these implicit representations need to be optimized for each new image pair, which is a stochastic process that may fail to converge to a global minimum. To improve robustness, we propose a deformable registration method using pairs of cycle-consistent Implicit Neural Representations: each implicit representation is linked to a second implicit representation that estimates the opposite transformation, causing each network to act as a regularizer for its paired opposite. During inference, we generate multiple deformation estimates by numerically inverting the paired backward transformation and evaluating the consensus of the optimized pair. This consensus improves registration accuracy over using a single representation and results in a robust uncertainty metric that can be used for automatic quality control. We evaluate our method with a 4D lung CT dataset. The proposed cycle-consistent optimization method reduces the optimization failure rate from 2.4% to 0.0% compared to the current state-of-the-art. The proposed inference method improves landmark accuracy by 4.5% and the proposed uncertainty metric detects all instances where the registration method fails to converge to a correct solution. We verify the generalizability of these results to other data using a centerline propagation task in abdominal 4D MRI, where our method achieves a 46% improvement in propagation consistency compared with single-INR registration and demonstrates a strong correlation between the proposed uncertainty metric and registration accuracy.

CVFeb 12
Synthesis of Late Gadolinium Enhancement Images via Implicit Neural Representations for Cardiac Scar Segmentation

Soufiane Ben Haddou, Laura Alvarez-Florez, Erik J. Bekkers et al.

Late gadolinium enhancement (LGE) imaging is the clinical standard for myocardial scar assessment, but limited annotated datasets hinder the development of automated segmentation methods. We propose a novel framework that synthesises both LGE images and their corresponding segmentation masks using implicit neural representations (INRs) combined with denoising diffusion models. Our approach first trains INRs to capture continuous spatial representations of LGE data and associated myocardium and fibrosis masks. These INRs are then compressed into compact latent embeddings, preserving essential anatomical information. A diffusion model operates on this latent space to generate new representations, which are decoded into synthetic LGE images with anatomically consistent segmentation masks. Experiments on 133 cardiac MRI scans suggest that augmenting training data with 200 synthetic volumes contributes to improved fibrosis segmentation performance, with the Dice score showing an increase from 0.509 to 0.524. Our approach provides an annotation-free method to help mitigate data scarcity.The code for this research is publicly available.

IVFeb 13
Dual-Phase Cross-Modal Contrastive Learning for CMR-Guided ECG Representations for Cardiovascular Disease Assessment

Laura Alvarez-Florez, Angel Bujalance-Gomez, Femke Raijmakers et al.

Cardiac magnetic resonance imaging (CMR) offers detailed evaluation of cardiac structure and function, but its limited accessibility restricts use to selected patient populations. In contrast, the electrocardiogram (ECG) is ubiquitous and inexpensive, and provides rich information on cardiac electrical activity and rhythm, yet offers limited insight into underlying cardiac structure and mechanical function. To address this, we introduce a contrastive learning framework that improves the extraction of clinically relevant cardiac phenotypes from ECG by learning from paired ECG-CMR data. Our approach aligns ECG representations with 3D CMR volumes at end-diastole (ED) and end-systole (ES), with a dual-phase contrastive loss to anchor each ECG jointly with both cardiac phases in a shared latent space. Unlike prior methods limited to 2D CMR representations with or without a temporal component, our framework models 3D anatomy at both ED and ES phases as distinct latent representations, enabling flexible disentanglement of structural and functional cardiac properties. Using over 34,000 ECG-CMR pairs from the UK Biobank, we demonstrate improved extraction of image-derived phenotypes from ECG, particularly for functional parameters ($\uparrow$ 9.2\%), while improvements in clinical outcome prediction remained modest ($\uparrow$ 0.7\%). This strategy could enable scalable and cost-effective extraction of image-derived traits from ECG. The code for this research is publicly available.

CVFeb 5
Neural Implicit 3D Cardiac Shape Reconstruction from Sparse CT Angiography Slices Mimicking 2D Transthoracic Echocardiography Views

Gino E. Jansen, Carolina Brás, R. Nils Planken et al.

Accurate 3D representations of cardiac structures allow quantitative analysis of anatomy and function. In this work, we propose a method for reconstructing complete 3D cardiac shapes from segmentations of sparse planes in CT angiography (CTA) for application in 2D transthoracic echocardiography (TTE). Our method uses a neural implicit function to reconstruct the 3D shape of the cardiac chambers and left-ventricle myocardium from sparse CTA planes. To investigate the feasibility of achieving 3D reconstruction from 2D TTE, we select planes that mimic the standard apical 2D TTE views. During training, a multi-layer perceptron learns shape priors from 3D segmentations of the target structures in CTA. At test time, the network reconstructs 3D cardiac shapes from segmentations of TTE-mimicking CTA planes by jointly optimizing the latent code and the rigid transforms that map the observed planes into 3D space. For each heart, we simulate four realistic apical views, and we compare reconstructed multi-class volumes with the reference CTA volumes. On a held-out set of CTA segmentations, our approach achieves an average Dice coefficient of 0.86 $\pm$ 0.04 across all structures. Our method also achieves markedly lower volume errors than the clinical standard, Simpson's biplane rule: 4.88 $\pm$ 4.26 mL vs. 8.14 $\pm$ 6.04 mL, respectively, for the left ventricle; and 6.40 $\pm$ 7.37 mL vs. 37.76 $\pm$ 22.96 mL, respectively, for the left atrium. This suggests that our approach offers a viable route to more accurate 3D chamber quantification in 2D transthoracic echocardiography.

MED-PHApr 30, 2024Code
ATOMMIC: An Advanced Toolbox for Multitask Medical Imaging Consistency to facilitate Artificial Intelligence applications from acquisition to analysis in Magnetic Resonance Imaging

Dimitrios Karkalousos, Ivana Išgum, Henk A. Marquering et al.

AI is revolutionizing MRI along the acquisition and processing chain. Advanced AI frameworks have been developed to apply AI in various successive tasks, such as image reconstruction, quantitative parameter map estimation, and image segmentation. Existing frameworks are often designed to perform tasks independently or are focused on specific models or datasets, limiting generalization. We introduce ATOMMIC, an open-source toolbox that streamlines AI applications for accelerated MRI reconstruction and analysis. ATOMMIC implements several tasks using DL networks and enables MultiTask Learning (MTL) to perform related tasks integrated, targeting generalization in the MRI domain. We first review the current state of AI frameworks for MRI through a comprehensive literature search and by parsing 12,479 GitHub repositories. We benchmark 25 DL models on eight publicly available datasets to present distinct applications of ATOMMIC on accelerated MRI reconstruction, image segmentation, quantitative parameter map estimation, and joint accelerated MRI reconstruction and image segmentation utilizing MTL. Our findings demonstrate that ATOMMIC is the only MTL framework with harmonized complex-valued and real-valued data support. Evaluations on single tasks show that physics-based models, which enforce data consistency by leveraging the physical properties of MRI, outperform other models in reconstructing highly accelerated acquisitions. Physics-based models that produce high reconstruction quality can accurately estimate quantitative parameter maps. When high-performing reconstruction models are combined with robust segmentation networks utilizing MTL, performance is improved in both tasks. ATOMMIC facilitates MRI reconstruction and analysis by standardizing workflows, enhancing data interoperability, integrating unique features like MTL, and effectively benchmarking DL models.

CVOct 29, 2024
Brain age identification from diffusion MRI synergistically predicts neurodegenerative disease

Chenyu Gao, Michael E. Kim, Karthik Ramadass et al.

Estimated brain age from magnetic resonance image (MRI) and its deviation from chronological age can provide early insights into potential neurodegenerative diseases, supporting early detection and implementation of prevention strategies. Diffusion MRI (dMRI) presents an opportunity to build an earlier biomarker for neurodegenerative disease prediction because it captures subtle microstructural changes that precede more perceptible macrostructural changes. However, the coexistence of macro- and micro-structural information in dMRI raises the question of whether current dMRI-based brain age estimation models are leveraging the intended microstructural information or if they inadvertently rely on the macrostructural information. To develop a microstructure-specific brain age, we propose a method for brain age identification from dMRI that mitigates the model's use of macrostructural information by non-rigidly registering all images to a standard template. Imaging data from 13,398 participants across 12 datasets were used for the training and evaluation. We compare our brain age models, trained with and without macrostructural information mitigated, with an architecturally similar T1-weighted (T1w) MRI-based brain age model and two recent, popular, openly available T1w MRI-based brain age models that primarily use macrostructural information. We observe difference between our dMRI-based brain age and T1w MRI-based brain age across stages of neurodegeneration, with dMRI-based brain age being older than T1w MRI-based brain age in participants transitioning from cognitively normal (CN) to mild cognitive impairment (MCI), but younger in participants already diagnosed with Alzheimer's disease (AD). Furthermore, dMRI-based brain age may offer advantages over T1w MRI-based brain age in predicting the transition from CN to MCI up to five years before diagnosis.

CVMar 7, 2025
Attenuation artifact detection and severity classification in intracoronary OCT using mixed image representations

Pierandrea Cancian, Simone Saitta, Xiaojin Gu et al.

In intracoronary optical coherence tomography (OCT), blood residues and gas bubbles cause attenuation artifacts that can obscure critical vessel structures. The presence and severity of these artifacts may warrant re-acquisition, prolonging procedure time and increasing use of contrast agent. Accurate detection of these artifacts can guide targeted re-acquisition, reducing the amount of repeated scans needed to achieve diagnostically viable images. However, the highly heterogeneous appearance of these artifacts poses a challenge for the automated detection of the affected image regions. To enable automatic detection of the attenuation artifacts caused by blood residues and gas bubbles based on their severity, we propose a convolutional neural network that performs classification of the attenuation lines (A-lines) into three classes: no artifact, mild artifact and severe artifact. Our model extracts and merges features from OCT images in both Cartesian and polar coordinates, where each column of the image represents an A-line. Our method detects the presence of attenuation artifacts in OCT frames reaching F-scores of 0.77 and 0.94 for mild and severe artifacts, respectively. The inference time over a full OCT scan is approximately 6 seconds. Our experiments show that analysis of images represented in both Cartesian and polar coordinate systems outperforms the analysis in polar coordinates only, suggesting that these representations contain complementary features. This work lays the foundation for automated artifact assessment and image acquisition guidance in intracoronary OCT imaging.

CVFeb 11
Fighting MRI Anisotropy: Learning Multiple Cardiac Shapes From a Single Implicit Neural Representation

Carolina Brás, Soufiane Ben Haddou, Thijs P. Kuipers et al.

The anisotropic nature of short-axis (SAX) cardiovascular magnetic resonance imaging (CMRI) limits cardiac shape analysis. To address this, we propose to leverage near-isotropic, higher resolution computed tomography angiography (CTA) data of the heart. We use this data to train a single neural implicit function to jointly represent cardiac shapes from CMRI at any resolution. We evaluate the method for the reconstruction of right ventricle (RV) and myocardium (MYO), where MYO simultaneously models endocardial and epicardial left-ventricle surfaces. Since high-resolution SAX reference segmentations are unavailable, we evaluate performance by extracting a 4-chamber (4CH) slice of RV and MYO from their reconstructed shapes. When compared with the reference 4CH segmentation masks from CMRI, our method achieved a Dice similarity coefficient of 0.91 $\pm$ 0.07 and 0.75 $\pm$ 0.13, and a Hausdorff distance of 6.21 $\pm$ 3.97 mm and 7.53 $\pm$ 5.13 mm for RV and MYO, respectively. Quantitative and qualitative assessment demonstrate the model's ability to reconstruct accurate, smooth and anatomically plausible shapes, supporting improvements in cardiac shape analysis.

CVMar 15, 2025
Temporally Consistent Mitral Annulus Measurements from Sparse Annotations in Echocardiographic Videos

Gino E. Jansen, Mark J. Schuuring, Berto J. Bouma et al.

This work presents a novel approach to achieving temporally consistent mitral annulus landmark localization in echocardiography videos using sparse annotations. Our method introduces a self-supervised loss term that enforces temporal consistency between neighboring frames, which smooths the position of landmarks and enhances measurement accuracy over time. Additionally, we incorporate realistic field-of-view augmentations to improve the recognition of missing anatomical landmarks. We evaluate our approach on both a public and private dataset, and demonstrate significant improvements in Mitral Annular Plane Systolic Excursion (MAPSE) calculations and overall landmark tracking stability. The method achieves a mean absolute MAPSE error of 1.81 $\pm$ 0.14 mm, an annulus size error of 2.46 $\pm$ 0.31 mm, and a landmark localization error of 2.48 $\pm$ 0.07 mm. Finally, it achieves a 0.99 ROC-AUC for recognition of missing landmarks.

IVFeb 18, 2022
Autoencoding Low-Resolution MRI for Semantically Smooth Interpolation of Anisotropic MRI

Jörg Sander, Bob D. de Vos, Ivana Išgum

High-resolution medical images are beneficial for analysis but their acquisition may not always be feasible. Alternatively, high-resolution images can be created from low-resolution acquisitions using conventional upsampling methods, but such methods cannot exploit high-level contextual information contained in the images. Recently, better performing deep-learning based super-resolution methods have been introduced. However, these methods are limited by their supervised character, i.e. they require high-resolution examples for training. Instead, we propose an unsupervised deep learning semantic interpolation approach that synthesizes new intermediate slices from encoded low-resolution examples. To achieve semantically smooth interpolation in through-plane direction, the method exploits the latent space generated by autoencoders. To generate new intermediate slices, latent space encodings of two spatially adjacent slices are combined using their convex combination. Subsequently, the combined encoding is decoded to an intermediate slice. To constrain the model, a notion of semantic similarity is defined for a given dataset. For this, a new loss is introduced that exploits the spatial relationship between slices of the same volume. During training, an existing in-between slice is generated using a convex combination of its neighboring slice encodings. The method was trained and evaluated using publicly available cardiac cine, neonatal brain and adult brain MRI scans. In all evaluations, the new method produces significantly better results in terms of Structural Similarity Index Measure and Peak Signal-to-Noise Ratio (p< 0.001 using one-sided Wilcoxon signed-rank test) than a cubic B-spline interpolation approach. Given the unsupervised nature of the method, high-resolution training data is not required and hence, the method can be readily applied in clinical settings.

IVNov 13, 2020
Automatic segmentation with detection of local segmentation failures in cardiac MRI

Jörg Sander, Bob D. de Vos, Ivana Išgum

Segmentation of cardiac anatomical structures in cardiac magnetic resonance images (CMRI) is a prerequisite for automatic diagnosis and prognosis of cardiovascular diseases. To increase robustness and performance of segmentation methods this study combines automatic segmentation and assessment of segmentation uncertainty in CMRI to detect image regions containing local segmentation failures. Three state-of-the-art convolutional neural networks (CNN) were trained to automatically segment cardiac anatomical structures and obtain two measures of predictive uncertainty: entropy and a measure derived by MC-dropout. Thereafter, using the uncertainties another CNN was trained to detect local segmentation failures that potentially need correction by an expert. Finally, manual correction of the detected regions was simulated. Using publicly available CMR scans from the MICCAI 2017 ACDC challenge, the impact of CNN architecture and loss function for segmentation, and the uncertainty measure was investigated. Performance was evaluated using the Dice coefficient and 3D Hausdorff distance between manual and automatic segmentation. The experiments reveal that combining automatic segmentation with simulated manual correction of detected segmentation failures leads to statistically significant performance increase.

IVOct 25, 2020
Unsupervised Super-Resolution: Creating High-Resolution Medical Images from Low-Resolution Anisotropic Examples

Jörg Sander, Bob D. de Vos, Ivana Išgum

Although high resolution isotropic 3D medical images are desired in clinical practice, their acquisition is not always feasible. Instead, lower resolution images are upsampled to higher resolution using conventional interpolation methods. Sophisticated learning-based super-resolution approaches are frequently unavailable in clinical setting, because such methods require training with high-resolution isotropic examples. To address this issue, we propose a learning-based super-resolution approach that can be trained using solely anisotropic images, i.e. without high-resolution ground truth data. The method exploits the latent space, generated by autoencoders trained on anisotropic images, to increase spatial resolution in low-resolution images. The method was trained and evaluated using 100 publicly available cardiac cine MR scans from the Automated Cardiac Diagnosis Challenge (ACDC). The quantitative results show that the proposed method performs better than conventional interpolation methods. Furthermore, the qualitative results indicate that especially finer cardiac structures are synthesized with high quality. The method has the potential to be applied to other anatomies and modalities and can be easily applied to any 3D anisotropic medical image dataset.

IVOct 1, 2020
Deep Group-wise Variational Diffeomorphic Image Registration

Tycho F. A. van der Ouderaa, Ivana Išgum, Wouter B. Veldhuis et al.

Deep neural networks are increasingly used for pair-wise image registration. We propose to extend current learning-based image registration to allow simultaneous registration of multiple images. To achieve this, we build upon the pair-wise variational and diffeomorphic VoxelMorph approach and present a general mathematical framework that enables both registration of multiple images to their geodesic average and registration in which any of the available images can be used as a fixed image. In addition, we provide a likelihood based on normalized mutual information, a well-known image similarity metric in registration, between multiple images, and a prior that allows for explicit control over the viscous fluid energy to effectively regularize deformations. We trained and evaluated our approach using intra-patient registration of breast MRI and Thoracic 4DCT exams acquired over multiple time points. Comparison with Elastix and VoxelMorph demonstrates competitive quantitative performance of the proposed method in terms of image similarity and reference landmark distances at significantly faster registration.

IVAug 10, 2020
Deep Learning from Dual-Energy Information for Whole-Heart Segmentation in Dual-Energy and Single-Energy Non-Contrast-Enhanced Cardiac CT

Steffen Bruns, Jelmer M. Wolterink, Richard A. P. Takx et al.

Deep learning-based whole-heart segmentation in coronary CT angiography (CCTA) allows the extraction of quantitative imaging measures for cardiovascular risk prediction. Automatic extraction of these measures in patients undergoing only non-contrast-enhanced CT (NCCT) scanning would be valuable. In this work, we leverage information provided by a dual-layer detector CT scanner to obtain a reference standard in virtual non-contrast (VNC) CT images mimicking NCCT images, and train a 3D convolutional neural network (CNN) for the segmentation of VNC as well as NCCT images. Contrast-enhanced acquisitions on a dual-layer detector CT scanner were reconstructed into a CCTA and a perfectly aligned VNC image. In each CCTA image, manual reference segmentations of the left ventricular (LV) myocardium, LV cavity, right ventricle, left atrium, right atrium, ascending aorta, and pulmonary artery trunk were obtained and propagated to the corresponding VNC image. These VNC images and reference segmentations were used to train 3D CNNs for automatic segmentation in either VNC images or NCCT images. Automatic segmentations in VNC images showed good agreement with reference segmentations, with an average Dice similarity coefficient of 0.897 \pm 0.034 and an average symmetric surface distance of 1.42 \pm 0.45 mm. Volume differences [95% confidence interval] between automatic NCCT and reference CCTA segmentations were -19 [-67; 30] mL for LV myocardium, -25 [-78; 29] mL for LV cavity, -29 [-73; 14] mL for right ventricle, -20 [-62; 21] mL for left atrium, and -19 [-73; 34] mL for right atrium, respectively. In 214 (74%) NCCT images from an independent multi-vendor multi-center set, two observers agreed that the automatic segmentation was mostly accurate or better. This method might enable quantification of additional cardiac measures from NCCT images for improved cardiovascular risk prediction.

IVJul 10, 2020
Deep Learning-Based Regression and Classification for Automatic Landmark Localization in Medical Images

Julia M. H. Noothout, Bob D. de Vos, Jelmer M. Wolterink et al.

In this study, we propose a fast and accurate method to automatically localize anatomical landmarks in medical images. We employ a global-to-local localization approach using fully convolutional neural networks (FCNNs). First, a global FCNN localizes multiple landmarks through the analysis of image patches, performing regression and classification simultaneously. In regression, displacement vectors pointing from the center of image patches towards landmark locations are determined. In classification, presence of landmarks of interest in the patch is established. Global landmark locations are obtained by averaging the predicted displacement vectors, where the contribution of each displacement vector is weighted by the posterior classification probability of the patch that it is pointing from. Subsequently, for each landmark localized with global localization, local analysis is performed. Specialized FCNNs refine the global landmark locations by analyzing local sub-images in a similar manner, i.e. by performing regression and classification simultaneously and combining the results. Evaluation was performed through localization of 8 anatomical landmarks in CCTA scans, 2 landmarks in olfactory MR scans, and 19 landmarks in cephalometric X-rays. We demonstrate that the method performs similarly to a second observer and is able to localize landmarks in a diverse set of medical images, differing in image modality, image dimensionality, and anatomical coverage.

IVNov 12, 2019
Automatic Online Quality Control of Synthetic CTs

Louis D. van Harten, Jelmer M. Wolterink, Joost J. C. Verhoeff et al.

Accurate MR-to-CT synthesis is a requirement for MR-only workflows in radiotherapy (RT) treatment planning. In recent years, deep learning-based approaches have shown impressive results in this field. However, to prevent downstream errors in RT treatment planning, it is important that deep learning models are only applied to data for which they are trained and that generated synthetic CT (sCT) images do not contain severe errors. For this, a mechanism for online quality control should be in place. In this work, we use an ensemble of sCT generators and assess their disagreement as a measure of uncertainty of the results. We show that this uncertainty measure can be used for two kinds of online quality control. First, to detect input images that are outside the expected distribution of MR images. Second, to identify sCT images that were generated from suitable MR images but potentially contain errors. Such automatic online quality control for sCT generation is likely to become an integral part of MR-only RT workflows.

LGNov 12, 2019
Exploiting Clinically Available Delineations for CNN-based Segmentation in Radiotherapy Treatment Planning

Louis D. van Harten, Jelmer M. Wolterink, Joost J. C. Verhoeff et al.

Convolutional neural networks (CNNs) have been widely and successfully used for medical image segmentation. However, CNNs are typically considered to require large numbers of dedicated expert-segmented training volumes, which may be limiting in practice. This work investigates whether clinically obtained segmentations which are readily available in picture archiving and communication systems (PACS) could provide a possible source of data to train a CNN for segmentation of organs-at-risk (OARs) in radiotherapy treatment planning. In such data, delineations of structures deemed irrelevant to the target clinical use may be lacking. To overcome this issue, we use multi-label instead of multi-class segmentation. We empirically assess how many clinical delineations would be sufficient to train a CNN for the segmentation of OARs and find that increasing the training set size beyond a limited number of images leads to sharply diminishing returns. Moreover, we find that by using multi-label segmentation, missing structures in the reference standard do not have a negative effect on overall segmentation accuracy. These results indicate that segmentations obtained in a clinical workflow can be used to train an accurate OAR segmentation model.

IVNov 10, 2019
Combined analysis of coronary arteries and the left ventricular myocardium in cardiac CT angiography for detection of patients with functionally significant stenosis

Majd Zreik, Tim Leiner, Nadieh Khalili et al.

Treatment of patients with obstructive coronary artery disease is guided by the functional significance of a coronary artery stenosis. Fractional flow reserve (FFR), measured during invasive coronary angiography (ICA), is considered the gold standard to define the functional significance of a coronary stenosis. Here, we present a method for non-invasive detection of patients with functionally significant coronary artery stenosis, combining analysis of the coronary artery tree and the left ventricular (LV) myocardium in cardiac CT angiography (CCTA) images. We retrospectively collected CCTA scans of 126 patients who underwent invasive FFR measurements, to determine the functional significance of coronary stenoses. We combine our previous works for the analysis of the complete coronary artery tree and the LV myocardium: Coronary arteries are encoded by two disjoint convolutional autoencoders (CAEs) and the LV myocardium is characterized by a convolutional neural network (CNN) and a CAE. Thereafter, using the extracted encodings of all coronary arteries and the LV myocardium, patients are classified according to the presence of functionally significant stenosis, as defined by the invasively measured FFR. To handle the varying number of coronary arteries in a patient, the classification is formulated as a multiple instance learning problem and is performed using an attention-based neural network. Cross-validation experiments resulted in an average area under the receiver operating characteristic curve of $0.74 \pm 0.01$, and showed that the proposed combined analysis outperformed the analysis of the coronary arteries or the LV myocardium only. The results demonstrate the feasibility of combining the analyses of the complete coronary artery tree and the LV myocardium in CCTA images for the detection of patients with functionally significant stenosis in coronary arteries.

IVAug 14, 2019
Graph Convolutional Networks for Coronary Artery Segmentation in Cardiac CT Angiography

Jelmer M. Wolterink, Tim Leiner, Ivana Išgum

Detection of coronary artery stenosis in coronary CT angiography (CCTA) requires highly personalized surface meshes enclosing the coronary lumen. In this work, we propose to use graph convolutional networks (GCNs) to predict the spatial location of vertices in a tubular surface mesh that segments the coronary artery lumen. Predictions for individual vertex locations are based on local image features as well as on features of neighboring vertices in the mesh graph. The method was trained and evaluated using the publicly available Coronary Artery Stenoses Detection and Quantification Evaluation Framework. Surface meshes enclosing the full coronary artery tree were automatically extracted. A quantitative evaluation on 78 coronary artery segments showed that these meshes corresponded closely to reference annotations, with a Dice similarity coefficient of 0.75/0.73, a mean surface distance of 0.25/0.28 mm, and a Hausdorff distance of 1.53/1.86 mm in healthy/diseased vessel segments. The results showed that inclusion of mesh information in a GCN improves segmentation overlap and accuracy over a baseline model without interaction on the mesh. The results indicate that GCNs allow efficient extraction of coronary artery surface meshes and that the use of GCNs leads to regular and more accurate meshes.

IVJun 28, 2019
Adversarial optimization for joint registration and segmentation in prostate CT radiotherapy

Mohamed S. Elmahdy, Jelmer M. Wolterink, Hessam Sokooti et al.

Joint image registration and segmentation has long been an active area of research in medical imaging. Here, we reformulate this problem in a deep learning setting using adversarial learning. We consider the case in which fixed and moving images as well as their segmentations are available for training, while segmentations are not available during testing; a common scenario in radiotherapy. The proposed framework consists of a 3D end-to-end generator network that estimates the deformation vector field (DVF) between fixed and moving images in an unsupervised fashion and applies this DVF to the moving image and its segmentation. A discriminator network is trained to evaluate how well the moving image and segmentation align with the fixed image and segmentation. The proposed network was trained and evaluated on follow-up prostate CT scans for image-guided radiotherapy, where the planning CT contours are propagated to the daily CT images using the estimated DVF. A quantitative comparison with conventional registration using \texttt{elastix} showed that the proposed method improved performance and substantially reduced computation time, thus enabling real-time contour propagation necessary for online-adaptive radiotherapy.

IVJun 11, 2019
Deep learning analysis of coronary arteries in cardiac CT angiography for detection of patients requiring invasive coronary angiography

Majd Zreik, Robbert W. van Hamersvelt, Nadieh Khalili et al.

In patients with obstructive coronary artery disease, the functional significance of a coronary artery stenosis needs to be determined to guide treatment. This is typically established through fractional flow reserve (FFR) measurement, performed during invasive coronary angiography (ICA). We present a method for automatic and non-invasive detection of patients requiring ICA, employing deep unsupervised analysis of complete coronary arteries in cardiac CT angiography (CCTA) images. We retrospectively collected CCTA scans of 187 patients, 137 of them underwent invasive FFR measurement in 192 different coronary arteries. These FFR measurements served as a reference standard for the functional significance of the coronary stenosis. The centerlines of the coronary arteries were extracted and used to reconstruct straightened multi-planar reformatted (MPR) volumes. To automatically identify arteries with functionally significant stenosis that require ICA, each MPR volume was encoded into a fixed number of encodings using two disjoint 3D and 1D convolutional autoencoders performing spatial and sequential encodings, respectively. Thereafter, these encodings were employed to classify arteries using a support vector machine classifier. The detection of coronary arteries requiring invasive evaluation, evaluated using repeated cross-validation experiments, resulted in an area under the receiver operating characteristic curve of $0.81 \pm 0.02$ on the artery-level, and $0.87 \pm 0.02$ on the patient-level. The results demonstrate the feasibility of automatic non-invasive detection of patients that require ICA and possibly subsequent coronary artery intervention. This could potentially reduce the number of patients that unnecessarily undergo ICA.

CVOct 24, 2018
Generative adversarial networks and adversarial methods in biomedical image analysis

Jelmer M. Wolterink, Konstantinos Kamnitsas, Christian Ledig et al.

Generative adversarial networks (GANs) and other adversarial methods are based on a game-theoretical perspective on joint optimization of two neural networks as players in a game. Adversarial techniques have been extensively used to synthesize and analyze biomedical images. We provide an introduction to GANs and adversarial methods, with an overview of biomedical image analysis tasks that have benefited from such methods. We conclude with a discussion of strengths and limitations of adversarial methods in biomedical image analysis, and propose potential future research directions.

CVOct 7, 2018
Coronary Artery Centerline Extraction in Cardiac CT Angiography Using a CNN-Based Orientation Classifier

Jelmer M. Wolterink, Robbert W. van Hamersvelt, Max A. Viergever et al.

Coronary artery centerline extraction in cardiac CT angiography (CCTA) images is a prerequisite for evaluation of stenoses and atherosclerotic plaque. We propose an algorithm that extracts coronary artery centerlines in CCTA using a convolutional neural network (CNN). A 3D dilated CNN is trained to predict the most likely direction and radius of an artery at any given point in a CCTA image based on a local image patch. Starting from a single seed point placed manually or automatically anywhere in a coronary artery, a tracker follows the vessel centerline in two directions using the predictions of the CNN. Tracking is terminated when no direction can be identified with high certainty. The CNN was trained using 32 manually annotated centerlines in a training set consisting of 8 CCTA images provided in the MICCAI 2008 Coronary Artery Tracking Challenge (CAT08). Evaluation using 24 test images of the CAT08 challenge showed that extracted centerlines had an average overlap of 93.7% with 96 manually annotated reference centerlines. Extracted centerline points were highly accurate, with an average distance of 0.21 mm to reference centerline points. In a second test set consisting of 50 CCTA scans, 5,448 markers in the coronary arteries were used as seed points to extract single centerlines. This showed strong correspondence between extracted centerlines and manually placed markers. In a third test set containing 36 CCTA scans, fully automatic seeding and centerline extraction led to extraction of on average 92% of clinically relevant coronary artery segments. The proposed method is able to accurately and efficiently determine the direction and radius of coronary arteries. The method can be trained with limited training data, and once trained allows fast automatic or interactive extraction of coronary artery trees from CCTA images.

CVOct 4, 2018
Direct Prediction of Cardiovascular Mortality from Low-dose Chest CT using Deep Learning

Sanne G. M. van Velzen, Majd Zreik, Nikolas Lessmann et al.

Cardiovascular disease (CVD) is a leading cause of death in the lung cancer screening population. Chest CT scans made in lung cancer screening are suitable for identification of participants at risk of CVD. Existing methods analyzing CT images from lung cancer screening for prediction of CVD events or mortality use engineered features extracted from the images combined with patient information. In this work we propose a method that automatically predicts 5-year cardiovascular mortality directly from chest CT scans without the need for hand-crafting image features. A set of 1,583 participants of the National Lung Screening Trial was included (1,188 survivors, 395 non-survivors). Low-dose chest CT images acquired at baseline were analyzed and the follow-up time was 5 years. To limit the analysis to the heart region, the heart was first localized by our previously developed algorithm for organ localization exploiting convolutional neural networks. Thereafter, a convolutional autoencoder was used to encode the identified heart region. Finally, based on the extracted encodings subjects were classified into survivors or non-survivors using a support vector machine classifier. The performance of the method was assessed in eight cross-validation experiments with 1,433 images used for training, 50 for validation and 100 for testing. The method achieved a performance with an area under the ROC curve of 0.72. The results demonstrate that prediction of cardiovascular mortality directly from low-dose screening chest CT scans, without hand-crafted features, is feasible, allowing identification of subjects at risk of fatal CVD events.

CVSep 27, 2018
Improving Myocardium Segmentation in Cardiac CT Angiography using Spectral Information

Steffen Bruns, Jelmer M. Wolterink, Robbert W. van Hamersvelt et al.

Accurate segmentation of the left ventricle myocardium in cardiac CT angiography (CCTA) is essential for e.g. the assessment of myocardial perfusion. Automatic deep learning methods for segmentation in CCTA might suffer from differences in contrast-agent attenuation between training and test data due to non-standardized contrast administration protocols and varying cardiac output. We propose augmentation of the training data with virtual mono-energetic reconstructions from a spectral CT scanner which show different attenuation levels of the contrast agent. We compare this to an augmentation by linear scaling of all intensity values, and combine both types of augmentation. We train a 3D fully convolutional network (FCN) with 10 conventional CCTA images and corresponding virtual mono-energetic reconstructions acquired on a spectral CT scanner, and evaluate on 40 CCTA scans acquired on a conventional CT scanner. We show that training with data augmentation using virtual mono-energetic images improves upon training with only conventional images (Dice similarity coefficient (DSC) 0.895 $\pm$ 0.039 vs. 0.846 $\pm$ 0.125). In comparison, training with data augmentation using linear scaling improves the DSC to 0.890 $\pm$ 0.039. Moreover, combining the results of both augmentation methods leads to a DSC of 0.901 $\pm$ 0.036, showing that both augmentations lead to different local improvements of the segmentations. Our results indicate that virtual mono-energetic images improve the generalization of an FCN used for myocardium segmentation in CCTA images.

CVSep 27, 2018
Towards increased trustworthiness of deep learning segmentation methods on cardiac MRI

Jörg Sander, Bob D. de Vos, Jelmer M. Wolterink et al.

Current state-of-the-art deep learning segmentation methods have not yet made a broad entrance into the clinical setting in spite of high demand for such automatic methods. One important reason is the lack of reliability caused by models that fail unnoticed and often locally produce anatomically implausible results that medical experts would not make. This paper presents an automatic image segmentation method based on (Bayesian) dilated convolutional networks (DCNN) that generate segmentation masks and spatial uncertainty maps for the input image at hand. The method was trained and evaluated using segmentation of the left ventricle (LV) cavity, right ventricle (RV) endocardium and myocardium (Myo) at end-diastole (ED) and end-systole (ES) in 100 cardiac 2D MR scans from the MICCAI 2017 Challenge (ACDC). Combining segmentations and uncertainty maps and employing a human-in-the-loop setting, we provide evidence that image areas indicated as highly uncertain regarding the obtained segmentation almost entirely cover regions of incorrect segmentations. The fused information can be harnessed to increase segmentation performance. Our results reveal that we can obtain valuable spatial uncertainty maps with low computational effort using DCNNs.

CVApr 13, 2018
CNN-based Landmark Detection in Cardiac CTA Scans

Julia M. H. Noothout, Bob D. de Vos, Jelmer M. Wolterink et al.

Fast and accurate anatomical landmark detection can benefit many medical image analysis methods. Here, we propose a method to automatically detect anatomical landmarks in medical images. Automatic landmark detection is performed with a patch-based fully convolutional neural network (FCNN) that combines regression and classification. For any given image patch, regression is used to predict the 3D displacement vector from the image patch to the landmark. Simultaneously, classification is used to identify patches that contain the landmark. Under the assumption that patches close to a landmark can determine the landmark location more precisely than patches farther from it, only those patches that contain the landmark according to classification are used to determine the landmark location. The landmark location is obtained by calculating the average landmark location using the computed 3D displacement vectors. The method is evaluated using detection of six clinically relevant landmarks in coronary CT angiography (CCTA) scans: the right and left ostium, the bifurcation of the left main coronary artery (LM) into the left anterior descending and the left circumflex artery, and the origin of the right, non-coronary, and left aortic valve commissure. The proposed method achieved an average Euclidean distance error of 2.19 mm and 2.88 mm for the right and left ostium respectively, 3.78 mm for the bifurcation of the LM, and 1.82 mm, 2.10 mm and 1.89 mm for the origin of the right, non-coronary, and left aortic valve commissure respectively, demonstrating accurate performance. The proposed combination of regression and classification can be used to accurately detect landmarks in CCTA scans.

CVApr 12, 2018
Iterative fully convolutional neural networks for automatic vertebra segmentation and identification

Nikolas Lessmann, Bram van Ginneken, Pim A. de Jong et al.

Precise segmentation and anatomical identification of the vertebrae provides the basis for automatic analysis of the spine, such as detection of vertebral compression fractures or other abnormalities. Most dedicated spine CT and MR scans as well as scans of the chest, abdomen or neck cover only part of the spine. Segmentation and identification should therefore not rely on the visibility of certain vertebrae or a certain number of vertebrae. We propose an iterative instance segmentation approach that uses a fully convolutional neural network to segment and label vertebrae one after the other, independently of the number of visible vertebrae. This instance-by-instance segmentation is enabled by combining the network with a memory component that retains information about already segmented vertebrae. The network iteratively analyzes image patches, using information from both image and memory to search for the next vertebra. To efficiently traverse the image, we include the prior knowledge that the vertebrae are always located next to each other, which is used to follow the vertebral column. This method was evaluated with five diverse datasets, including multiple modalities (CT and MR), various fields of view and coverages of different sections of the spine, and a particularly challenging set of low-dose chest CT scans. The proposed iterative segmentation method compares favorably with state-of-the-art methods and is fast, flexible and generalizable.

CVNov 24, 2017
Deep learning analysis of the myocardium in coronary CT angiography for identification of patients with functionally significant coronary artery stenosis

Majd Zreik, Nikolas Lessmann, Robbert W. van Hamersvelt et al.

In patients with coronary artery stenoses of intermediate severity, the functional significance needs to be determined. Fractional flow reserve (FFR) measurement, performed during invasive coronary angiography (ICA), is most often used in clinical practice. To reduce the number of ICA procedures, we present a method for automatic identification of patients with functionally significant coronary artery stenoses, employing deep learning analysis of the left ventricle (LV) myocardium in rest coronary CT angiography (CCTA). The study includes consecutively acquired CCTA scans of 166 patients with FFR measurements. To identify patients with a functionally significant coronary artery stenosis, analysis is performed in several stages. First, the LV myocardium is segmented using a multiscale convolutional neural network (CNN). To characterize the segmented LV myocardium, it is subsequently encoded using unsupervised convolutional autoencoder (CAE). Thereafter, patients are classified according to the presence of functionally significant stenosis using an SVM classifier based on the extracted and clustered encodings. Quantitative evaluation of LV myocardium segmentation in 20 images resulted in an average Dice coefficient of 0.91 and an average mean absolute distance between the segmented and reference LV boundaries of 0.7 mm. Classification of patients was evaluated in the remaining 126 CCTA scans in 50 10-fold cross-validation experiments and resulted in an area under the receiver operating characteristic curve of 0.74 +- 0.02. At sensitivity levels 0.60, 0.70 and 0.80, the corresponding specificity was 0.77, 0.71 and 0.59, respectively. The results demonstrate that automatic analysis of the LV myocardium in a single CCTA scan acquired at rest, without assessment of the anatomy of the coronary arteries, can be used to identify patients with functionally significant coronary artery stenosis.

CVNov 1, 2017
Automatic calcium scoring in low-dose chest CT using deep neural networks with dilated convolutions

Nikolas Lessmann, Bram van Ginneken, Majd Zreik et al.

Heavy smokers undergoing screening with low-dose chest CT are affected by cardiovascular disease as much as by lung cancer. Low-dose chest CT scans acquired in screening enable quantification of atherosclerotic calcifications and thus enable identification of subjects at increased cardiovascular risk. This paper presents a method for automatic detection of coronary artery, thoracic aorta and cardiac valve calcifications in low-dose chest CT using two consecutive convolutional neural networks. The first network identifies and labels potential calcifications according to their anatomical location and the second network identifies true calcifications among the detected candidates. This method was trained and evaluated on a set of 1744 CT scans from the National Lung Screening Trial. To determine whether any reconstruction or only images reconstructed with soft tissue filters can be used for calcification detection, we evaluated the method on soft and medium/sharp filter reconstructions separately. On soft filter reconstructions, the method achieved F1 scores of 0.89, 0.89, 0.67, and 0.55 for coronary artery, thoracic aorta, aortic valve and mitral valve calcifications, respectively. On sharp filter reconstructions, the F1 scores were 0.84, 0.81, 0.64, and 0.66, respectively. Linearly weighted kappa coefficients for risk category assignment based on per subject coronary artery calcium were 0.91 and 0.90 for soft and sharp filter reconstructions, respectively. These results demonstrate that the presented method enables reliable automatic cardiovascular risk assessment in all low-dose chest CT scans acquired for lung cancer screening.

CVApr 20, 2017
End-to-End Unsupervised Deformable Image Registration with a Convolutional Neural Network

Bob D. de Vos, Floris F. Berendsen, Max A. Viergever et al.

In this work we propose a deep learning network for deformable image registration (DIRNet). The DIRNet consists of a convolutional neural network (ConvNet) regressor, a spatial transformer, and a resampler. The ConvNet analyzes a pair of fixed and moving images and outputs parameters for the spatial transformer, which generates the displacement vector field that enables the resampler to warp the moving image to the fixed image. The DIRNet is trained end-to-end by unsupervised optimization of a similarity metric between input image pairs. A trained DIRNet can be applied to perform registration on unseen image pairs in one pass, thus non-iteratively. Evaluation was performed with registration of images of handwritten digits (MNIST) and cardiac cine MR scans (Sunnybrook Cardiac Data). The results demonstrate that registration with DIRNet is as accurate as a conventional deformable image registration method with substantially shorter execution times.

CVApr 19, 2017
ConvNet-Based Localization of Anatomical Structures in 3D Medical Images

Bob D. de Vos, Jelmer M. Wolterink, Pim A. de Jong et al.

Localization of anatomical structures is a prerequisite for many tasks in medical image analysis. We propose a method for automatic localization of one or more anatomical structures in 3D medical images through detection of their presence in 2D image slices using a convolutional neural network (ConvNet). A single ConvNet is trained to detect presence of the anatomical structure of interest in axial, coronal, and sagittal slices extracted from a 3D image. To allow the ConvNet to analyze slices of different sizes, spatial pyramid pooling is applied. After detection, 3D bounding boxes are created by combining the output of the ConvNet in all slices. In the experiments 200 chest CT, 100 cardiac CT angiography (CTA), and 100 abdomen CT scans were used. The heart, ascending aorta, aortic arch, and descending aorta were localized in chest CT scans, the left cardiac ventricle in cardiac CTA scans, and the liver in abdomen CT scans. Localization was evaluated using the distances between automatically and manually defined reference bounding box centroids and walls. The best results were achieved in localization of structures with clearly defined boundaries (e.g. aortic arch) and the worst when the structure boundary was not clearly visible (e.g. liver). The method was more robust and accurate in localization multiple structures.

CVApr 12, 2017
Dilated Convolutional Neural Networks for Cardiovascular MR Segmentation in Congenital Heart Disease

Jelmer M. Wolterink, Tim Leiner, Max A. Viergever et al.

We propose an automatic method using dilated convolutional neural networks (CNNs) for segmentation of the myocardium and blood pool in cardiovascular MR (CMR) of patients with congenital heart disease (CHD). Ten training and ten test CMR scans cropped to an ROI around the heart were provided in the MICCAI 2016 HVSMR challenge. A dilated CNN with a receptive field of 131x131 voxels was trained for myocardium and blood pool segmentation in axial, sagittal and coronal image slices. Performance was evaluated within the HVSMR challenge. Automatic segmentation of the test scans resulted in Dice indices of 0.80$\pm$0.06 and 0.93$\pm$0.02, average distances to boundaries of 0.96$\pm$0.31 and 0.89$\pm$0.24 mm, and Hausdorff distances of 6.13$\pm$3.76 and 7.07$\pm$3.01 mm for the myocardium and blood pool, respectively. Segmentation took 41.5$\pm$14.7 s per scan. In conclusion, dilated CNNs trained on a small set of CMR images of CHD patients showing large anatomical variability provide accurate myocardium and blood pool segmentations.

CVApr 11, 2017
Deep Learning for Multi-Task Medical Image Segmentation in Multiple Modalities

Pim Moeskops, Jelmer M. Wolterink, Bas H. M. van der Velden et al.

Automatic segmentation of medical images is an important task for many clinical applications. In practice, a wide range of anatomical structures are visualised using different imaging modalities. In this paper, we investigate whether a single convolutional neural network (CNN) can be trained to perform different segmentation tasks. A single CNN is trained to segment six tissues in MR brain images, the pectoral muscle in MR breast images, and the coronary arteries in cardiac CTA. The CNN therefore learns to identify the imaging modality, the visualised anatomical structures, and the tissue classes. For each of the three tasks (brain MRI, breast MRI and cardiac CTA), this combined training procedure resulted in a segmentation performance equivalent to that of a CNN trained specifically for that task, demonstrating the high capacity of CNN architectures. Hence, a single system could be used in clinical practice to automatically perform diverse segmentation tasks without task-specific training.

CVApr 11, 2017
Automatic segmentation of MR brain images with a convolutional neural network

Pim Moeskops, Max A. Viergever, Adriënne M. Mendrik et al.

Automatic segmentation in MR brain images is important for quantitative analysis in large-scale studies with images acquired at all ages. This paper presents a method for the automatic segmentation of MR brain images into a number of tissue classes using a convolutional neural network. To ensure that the method obtains accurate segmentation details as well as spatial consistency, the network uses multiple patch sizes and multiple convolution kernel sizes to acquire multi-scale information about each voxel. The method is not dependent on explicit features, but learns to recognise the information that is important for the classification based on training data. The method requires a single anatomical MR image only. The segmentation method is applied to five different data sets: coronal T2-weighted images of preterm infants acquired at 30 weeks postmenstrual age (PMA) and 40 weeks PMA, axial T2- weighted images of preterm infants acquired at 40 weeks PMA, axial T1-weighted images of ageing adults acquired at an average age of 70 years, and T1-weighted images of young adults acquired at an average age of 23 years. The method obtained the following average Dice coefficients over all segmented tissue classes for each data set, respectively: 0.87, 0.82, 0.84, 0.86 and 0.91. The results demonstrate that the method obtains accurate segmentations in all five sets, and hence demonstrates its robustness to differences in age and acquisition protocol.