Giacomo Tarroni

CV
h-index4
28papers
1,961citations
Novelty49%
AI Score44

28 Papers

CVJul 9, 2024
Ensembled Cold-Diffusion Restorations for Unsupervised Anomaly Detection

Sergio Naval Marimont, Vasilis Siomos, Matthew Baugh et al.

Unsupervised Anomaly Detection (UAD) methods aim to identify anomalies in test samples comparing them with a normative distribution learned from a dataset known to be anomaly-free. Approaches based on generative models offer interpretability by generating anomaly-free versions of test images, but are typically unable to identify subtle anomalies. Alternatively, approaches using feature modelling or self-supervised methods, such as the ones relying on synthetically generated anomalies, do not provide out-of-the-box interpretability. In this work, we present a novel method that combines the strengths of both strategies: a generative cold-diffusion pipeline (i.e., a diffusion-like pipeline which uses corruptions not based on noise) that is trained with the objective of turning synthetically-corrupted images back to their normal, original appearance. To support our pipeline we introduce a novel synthetic anomaly generation procedure, called DAG, and a novel anomaly score which ensembles restorations conditioned with different degrees of abnormality. Our method surpasses the prior state-of-the art for unsupervised anomaly detection in three different Brain MRI datasets.

CVNov 26, 2023
DISYRE: Diffusion-Inspired SYnthetic REstoration for Unsupervised Anomaly Detection

Sergio Naval Marimont, Matthew Baugh, Vasilis Siomos et al.

Unsupervised Anomaly Detection (UAD) techniques aim to identify and localize anomalies without relying on annotations, only leveraging a model trained on a dataset known to be free of anomalies. Diffusion models learn to modify inputs $x$ to increase the probability of it belonging to a desired distribution, i.e., they model the score function $\nabla_x \log p(x)$. Such a score function is potentially relevant for UAD, since $\nabla_x \log p(x)$ is itself a pixel-wise anomaly score. However, diffusion models are trained to invert a corruption process based on Gaussian noise and the learned score function is unlikely to generalize to medical anomalies. This work addresses the problem of how to learn a score function relevant for UAD and proposes DISYRE: Diffusion-Inspired SYnthetic REstoration. We retain the diffusion-like pipeline but replace the Gaussian noise corruption with a gradual, synthetic anomaly corruption so the learned score function generalizes to medical, naturally occurring anomalies. We evaluate DISYRE on three common Brain MRI UAD benchmarks and substantially outperform other methods in two out of the three tasks.

CVAug 2, 2023
Achieving state-of-the-art performance in the Medical Out-of-Distribution (MOOD) challenge using plausible synthetic anomalies

Sergio Naval Marimont, Giacomo Tarroni

The detection and localization of anomalies is one important medical image analysis task. Most commonly, Computer Vision anomaly detection approaches rely on manual annotations that are both time consuming and expensive to obtain. Unsupervised anomaly detection, or Out-of-Distribution detection, aims at identifying anomalous samples relying only on unannotated samples considered normal. In this study we present a new unsupervised anomaly detection method. Our method builds upon the self-supervised strategy consisting on training a segmentation network to identify local synthetic anomalies. Our contributions improve the synthetic anomaly generation process, making synthetic anomalies more heterogeneous and challenging by 1) using complex random shapes and 2) smoothing the edges of synthetic anomalies so networks cannot rely on the high gradient between image and synthetic anomalies. In our implementation we adopted standard practices in 3D medical image segmentation, including 3D U-Net architecture, patch-wise training and model ensembling. Our method was evaluated using a validation set with different types of synthetic anomalies. Our experiments show that our method improved substantially the baseline method performance. Additionally, we evaluated our method by participating in the Medical Out-of-Distribution (MOOD) Challenge held at MICCAI in 2022 and achieved first position in both sample-wise and pixel-wise tasks. Our experiments and results in the latest MOOD challenge show that our simple yet effective approach can substantially improve the performance of Out-of-Distribution detection techniques which rely on synthetic anomalies.

IVJun 30, 2022
Implicit U-Net for volumetric medical image segmentation

Sergio Naval Marimont, Giacomo Tarroni

U-Net has been the go-to architecture for medical image segmentation tasks, however computational challenges arise when extending the U-Net architecture to 3D images. We propose the Implicit U-Net architecture that adapts the efficient Implicit Representation paradigm to supervised image segmentation tasks. By combining a convolutional feature extractor with an implicit localization network, our implicit U-Net has 40% less parameters than the equivalent U-Net. Moreover, we propose training and inference procedures to capitalize sparse predictions. When comparing to an equivalent fully convolutional U-Net, Implicit U-Net reduces by approximately 30% inference and training time as well as training memory footprint while achieving comparable results in our experiments with two different abdominal CT scan datasets.

CVJul 27, 2023
MIM-OOD: Generative Masked Image Modelling for Out-of-Distribution Detection in Medical Images

Sergio Naval Marimont, Vasilis Siomos, Giacomo Tarroni

Unsupervised Out-of-Distribution (OOD) detection consists in identifying anomalous regions in images leveraging only models trained on images of healthy anatomy. An established approach is to tokenize images and model the distribution of tokens with Auto-Regressive (AR) models. AR models are used to 1) identify anomalous tokens and 2) in-paint anomalous representations with in-distribution tokens. However, AR models are slow at inference time and prone to error accumulation issues which negatively affect OOD detection performance. Our novel method, MIM-OOD, overcomes both speed and error accumulation issues by replacing the AR model with two task-specific networks: 1) a transformer optimized to identify anomalous tokens and 2) a transformer optimized to in-paint anomalous tokens using masked image modelling (MIM). Our experiments with brain MRI anomalies show that MIM-OOD substantially outperforms AR models (DICE 0.458 vs 0.301) while achieving a nearly 25x speedup (9.5s vs 244s).

CVNov 24, 2023
ARIA: On the Interaction Between Architectures, Initialization and Aggregation Methods for Federated Visual Classification

Vasilis Siomos, Sergio Naval-Marimont, Jonathan Passerat-Palmbach et al.

Federated Learning (FL) is a collaborative training paradigm that allows for privacy-preserving learning of cross-institutional models by eliminating the exchange of sensitive data and instead relying on the exchange of model parameters between the clients and a server. Despite individual studies on how client models are aggregated, and, more recently, on the benefits of ImageNet pre-training, there is a lack of understanding of the effect the architecture chosen for the federation has, and of how the aforementioned elements interconnect. To this end, we conduct the first joint ARchitecture-Initialization-Aggregation study and benchmark ARIAs across a range of medical image classification tasks. We find that, contrary to current practices, ARIA elements have to be chosen together to achieve the best possible performance. Our results also shed light on good choices for each element depending on the task, the effect of normalisation layers, and the utility of SSL pre-training, pointing to potential directions for designing FL-specific architectures and training pipelines.

CVDec 5, 2025Code
The MICCAI Federated Tumor Segmentation (FeTS) Challenge 2024: Efficient and Robust Aggregation Methods for Federated Learning

Akis Linardos, Sarthak Pati, Ujjwal Baid et al.

We present the design and results of the MICCAI Federated Tumor Segmentation (FeTS) Challenge 2024, which focuses on federated learning (FL) for glioma sub-region segmentation in multi-parametric MRI and evaluates new weight aggregation methods aimed at improving robustness and efficiency. Six participating teams were evaluated using a standardized FL setup and a multi-institutional dataset derived from the BraTS glioma benchmark, consisting of 1,251 training cases, 219 validation cases, and 570 hidden test cases with segmentations for enhancing tumor (ET), tumor core (TC), and whole tumor (WT). Teams were ranked using a cumulative scoring system that considered both segmentation performance, measured by Dice Similarity Coefficient (DSC) and the 95th percentile Hausdorff Distance (HD95), and communication efficiency assessed through the convergence score. A PID-controller-based method achieved the top overall ranking, obtaining mean DSC values of 0.733, 0.761, and 0.751 for ET, TC, and WT, respectively, with corresponding HD95 values of 33.922 mm, 33.623 mm, and 32.309 mm, while also demonstrating the highest communication efficiency with a convergence score of 0.764. These findings advance the state of federated learning for medical imaging, surpassing top-performing methods from previous challenge iterations and highlighting PID controllers as effective mechanisms for stabilizing and optimizing weight aggregation in FL. The challenge code is available at https://github.com/FeTS-AI/Challenge.

IVJun 27, 2025Code
FedCLAM: Client Adaptive Momentum with Foreground Intensity Matching for Federated Medical Image Segmentation

Vasilis Siomos, Jonathan Passerat-Palmbach, Giacomo Tarroni

Federated learning is a decentralized training approach that keeps data under stakeholder control while achieving superior performance over isolated training. While inter-institutional feature discrepancies pose a challenge in all federated settings, medical imaging is particularly affected due to diverse imaging devices and population variances, which can diminish the global model's effectiveness. Existing aggregation methods generally fail to adapt across varied circumstances. To address this, we propose FedCLAM, which integrates \textit{client-adaptive momentum} terms derived from each client's loss reduction during local training, as well as a \textit{personalized dampening factor} to curb overfitting. We further introduce a novel \textit{intensity alignment} loss that matches predicted and ground-truth foreground distributions to handle heterogeneous image intensity profiles across institutions and devices. Extensive evaluations on two datasets show that FedCLAM surpasses eight cutting-edge methods in medical segmentation tasks, underscoring its efficacy. The code is available at https://github.com/siomvas/FedCLAM.

AIMay 22, 2023Code
NeSy4VRD: A Multifaceted Resource for Neurosymbolic AI Research using Knowledge Graphs in Visual Relationship Detection

David Herron, Ernesto Jiménez-Ruiz, Giacomo Tarroni et al.

NeSy4VRD is a multifaceted resource designed to support the development of neurosymbolic AI (NeSy) research. NeSy4VRD re-establishes public access to the images of the VRD dataset and couples them with an extensively revised, quality-improved version of the VRD visual relationship annotations. Crucially, NeSy4VRD provides a well-aligned, companion OWL ontology that describes the dataset domain.It comes with open source infrastructure that provides comprehensive support for extensibility of the annotations (which, in turn, facilitates extensibility of the ontology), and open source code for loading the annotations to/from a knowledge graph. We are contributing NeSy4VRD to the computer vision, NeSy and Semantic Web communities to help foster more NeSy research using OWL-based knowledge graphs.

IVDec 23, 2023
Sample selection with noise rate estimation in noise learning of medical image analysis

Maolin Li, Giacomo Tarroni

In the field of medical image analysis, deep learning models have demonstrated remarkable success in enhancing diagnostic accuracy and efficiency. However, the reliability of these models is heavily dependent on the quality of training data, and the existence of label noise (errors in dataset annotations) of medical image data presents a significant challenge. This paper introduces a new sample selection method that enhances the performance of neural networks when trained on noisy datasets. Our approach features estimating the noise rate of a dataset by analyzing the distribution of loss values using Linear Regression. Samples are then ranked according to their loss values, and potentially noisy samples are excluded from the dataset. Additionally, we employ sparse regularization to further enhance the noise robustness of our model. Our proposed method is evaluated on five benchmark datasets and a real-life noisy medical image dataset. Notably, two of these datasets contain 3D medical images. The results of our experiments show that our method outperforms existing noise-robust learning methods, particularly in scenarios with high noise rates. Key words: noise-robust learning, medical image analysis, noise rate estimation, sample selection, sparse regularization

IVAug 7, 2021
Enhancing MR Image Segmentation with Realistic Adversarial Data Augmentation

Chen Chen, Chen Qin, Cheng Ouyang et al.

The success of neural networks on medical image segmentation tasks typically relies on large labeled datasets for model training. However, acquiring and manually labeling a large medical image set is resource-intensive, expensive, and sometimes impractical due to data sharing and privacy issues. To address this challenge, we propose AdvChain, a generic adversarial data augmentation framework, aiming at improving both the diversity and effectiveness of training data for medical image segmentation tasks. AdvChain augments data with dynamic data augmentation, generating randomly chained photo-metric and geometric transformations to resemble realistic yet challenging imaging variations to expand training data. By jointly optimizing the data augmentation model and a segmentation network during training, challenging examples are generated to enhance network generalizability for the downstream task. The proposed adversarial data augmentation does not rely on generative networks and can be used as a plug-in module in general segmentation networks. It is computationally efficient and applicable for both low-shot supervised and semi-supervised learning. We analyze and evaluate the method on two MR image segmentation tasks: cardiac segmentation and prostate segmentation with limited labeled data. Results show that the proposed approach can alleviate the need for labeled data while improving model generalization ability, indicating its practical value in medical imaging applications.

IVJun 9, 2021
Implicit field learning for unsupervised anomaly detection in medical images

Sergio Naval Marimont, Giacomo Tarroni

We propose a novel unsupervised out-of-distribution detection method for medical images based on implicit fields image representations. In our approach, an auto-decoder feed-forward neural network learns the distribution of healthy images in the form of a mapping between spatial coordinates and probabilities over a proxy for tissue types. At inference time, the learnt distribution is used to retrieve, from a given test image, a restoration, i.e. an image maximally consistent with the input one but belonging to the healthy distribution. Anomalies are localized using the voxel-wise probability predicted by our model for the restored image. We tested our approach in the task of unsupervised localization of gliomas on brain MR images and compared it to several other VAE-based anomaly detection methods. Results show that the proposed technique substantially outperforms them (average DICE 0.640 vs 0.518 for the best performing VAE-based alternative) while also requiring considerably less computing time.

CVDec 12, 2020
Anomaly detection through latent space restoration using vector-quantized variational autoencoders

Sergio Naval Marimont, Giacomo Tarroni

We propose an out-of-distribution detection method that combines density and restoration-based approaches using Vector-Quantized Variational Auto-Encoders (VQ-VAEs). The VQ-VAE model learns to encode images in a categorical latent space. The prior distribution of latent codes is then modelled using an Auto-Regressive (AR) model. We found that the prior probability estimated by the AR model can be useful for unsupervised anomaly detection and enables the estimation of both sample and pixel-wise anomaly scores. The sample-wise score is defined as the negative log-likelihood of the latent variables above a threshold selecting highly unlikely codes. Additionally, out-of-distribution images are restored into in-distribution images by replacing unlikely latent codes with samples from the prior model and decoding to pixel space. The average L1 distance between generated restorations and original image is used as pixel-wise anomaly score. We tested our approach on the MOOD challenge datasets, and report higher accuracies compared to a standard reconstruction-based approach with VAEs.

IVJun 23, 2020
Deep Generative Model-based Quality Control for Cardiac MRI Segmentation

Shuo Wang, Giacomo Tarroni, Chen Qin et al.

In recent years, convolutional neural networks have demonstrated promising performance in a variety of medical image segmentation tasks. However, when a trained segmentation model is deployed into the real clinical world, the model may not perform optimally. A major challenge is the potential poor-quality segmentations generated due to degraded image quality or domain shift issues. There is a timely need to develop an automated quality control method that can detect poor segmentations and feedback to clinicians. Here we propose a novel deep generative model-based framework for quality control of cardiac MRI segmentation. It first learns a manifold of good-quality image-segmentation pairs using a generative model. The quality of a given test segmentation is then assessed by evaluating the difference from its projection onto the good-quality manifold. In particular, the projection is refined through iterative search in the latent space. The proposed method achieves high prediction accuracy on two publicly available cardiac MRI datasets. Moreover, it shows better generalisation ability than traditional regression-based methods. Our approach provides a real-time and model-agnostic quality control for cardiac MRI segmentation, which has the potential to be integrated into clinical image analysis workflows.

IVJun 23, 2020
Realistic Adversarial Data Augmentation for MR Image Segmentation

Chen Chen, Chen Qin, Huaqi Qiu et al.

Neural network-based approaches can achieve high accuracy in various medical image segmentation tasks. However, they generally require large labelled datasets for supervised learning. Acquiring and manually labelling a large medical dataset is expensive and sometimes impractical due to data sharing and privacy issues. In this work, we propose an adversarial data augmentation method for training neural networks for medical image segmentation. Instead of generating pixel-wise adversarial attacks, our model generates plausible and realistic signal corruptions, which models the intensity inhomogeneities caused by a common type of artefacts in MR imaging: bias field. The proposed method does not rely on generative networks, and can be used as a plug-in module for general segmentation networks in both supervised and semi-supervised learning. Using cardiac MR imaging we show that such an approach can improve the generalization ability and robustness of models as well as provide significant improvements in low-data scenarios.

LGJun 11, 2020
Anti-Transfer Learning for Task Invariance in Convolutional Neural Networks for Speech Processing

Eric Guizzo, Tillman Weyde, Giacomo Tarroni

We introduce the novel concept of anti-transfer learning for speech processing with convolutional neural networks. While transfer learning assumes that the learning process for a target task will benefit from re-using representations learned for another task, anti-transfer avoids the learning of representations that have been learned for an orthogonal task, i.e., one that is not relevant and potentially misleading for the target task, such as speaker identity for speech recognition or speech content for emotion recognition. In anti-transfer learning, we penalize similarity between activations of a network being trained and another one previously trained on an orthogonal task, which yields more suitable representations. This leads to better generalization and provides a degree of control over correlations that are spurious or undesirable, e.g. to avoid social bias. We have implemented anti-transfer for convolutional neural networks in different configurations with several similarity metrics and aggregation functions, which we evaluate and analyze with several speech and audio tasks and settings, using six datasets. We show that anti-transfer actually leads to the intended invariance to the orthogonal task and to more appropriate features for the target task at hand. Anti-transfer learning consistently improves classification accuracy in all test cases. While anti-transfer creates computation and memory cost at training time, there is relatively little computation cost when using pre-trained models for orthogonal tasks. Anti-transfer is widely applicable and particularly useful where a specific invariance is desirable or where trained models are available and labeled data for orthogonal tasks are difficult to obtain.

IVNov 9, 2019
Deep learning for cardiac image segmentation: A review

Chen Chen, Chen Qin, Huaqi Qiu et al.

Deep learning has become the most widely used approach for cardiac image segmentation in recent years. In this paper, we provide a review of over 100 cardiac image segmentation papers using deep learning, which covers common imaging modalities including magnetic resonance imaging (MRI), computed tomography (CT), and ultrasound (US) and major anatomical structures of interest (ventricles, atria and vessels). In addition, a summary of publicly available cardiac image datasets and code repositories are included to provide a base for encouraging reproducible research. Finally, we discuss the challenges and limitations with current deep learning-based approaches (scarcity of labels, model generalizability across different domains, interpretability) and suggest potential directions for future research.

IVAug 20, 2019
Unsupervised Multi-modal Style Transfer for Cardiac MR Segmentation

Chen Chen, Cheng Ouyang, Giacomo Tarroni et al.

In this work, we present a fully automatic method to segment cardiac structures from late-gadolinium enhanced (LGE) images without using labelled LGE data for training, but instead by transferring the anatomical knowledge and features learned on annotated balanced steady-state free precession (bSSFP) images, which are easier to acquire. Our framework mainly consists of two neural networks: a multi-modal image translation network for style transfer and a cascaded segmentation network for image segmentation. The multi-modal image translation network generates realistic and diverse synthetic LGE images conditioned on a single annotated bSSFP image, forming a synthetic LGE training set. This set is then utilized to fine-tune the segmentation network pre-trained on labelled bSSFP images, achieving the goal of unsupervised LGE image segmentation. In particular, the proposed cascaded segmentation network is able to produce accurate segmentation by taking both shape prior and image appearance into account, achieving an average Dice score of 0.92 for the left ventricle, 0.83 for the myocardium, and 0.88 for the right ventricle on the test set.

IVJul 23, 2019
Learning Shape Priors for Robust Cardiac MR Segmentation from Multi-view Images

Chen Chen, Carlo Biffi, Giacomo Tarroni et al.

Cardiac MR image segmentation is essential for the morphological and functional analysis of the heart. Inspired by how experienced clinicians assess the cardiac morphology and function across multiple standard views (i.e. long- and short-axis views), we propose a novel approach which learns anatomical shape priors across different 2D standard views and leverages these priors to segment the left ventricular (LV) myocardium from short-axis MR image stacks. The proposed segmentation method has the advantage of being a 2D network but at the same time incorporates spatial context from multiple, complementary views that span a 3D space. Our method achieves accurate and robust segmentation of the myocardium across different short-axis slices (from apex to base), outperforming baseline models (e.g. 2D U-Net, 3D U-Net) while achieving higher data efficiency. Compared to the 2D U-Net, the proposed method reduces the mean Hausdorff distance (mm) from 3.24 to 2.49 on the apical slices, from 2.34 to 2.09 on the middle slices and from 3.62 to 2.76 on the basal slices on the test set, when only 10% of the training data was used.

CVJul 5, 2019
Self-Supervised Learning for Cardiac MR Image Segmentation by Anatomical Position Prediction

Wenjia Bai, Chen Chen, Giacomo Tarroni et al.

In the recent years, convolutional neural networks have transformed the field of medical image analysis due to their capacity to learn discriminative image features for a variety of classification and regression tasks. However, successfully learning these features requires a large amount of manually annotated data, which is expensive to acquire and limited by the available resources of expert image analysts. Therefore, unsupervised, weakly-supervised and self-supervised feature learning techniques receive a lot of attention, which aim to utilise the vast amount of available data, while at the same time avoid or substantially reduce the effort of manual annotation. In this paper, we propose a novel way for training a cardiac MR image segmentation network, in which features are learnt in a self-supervised manner by predicting anatomical positions. The anatomical positions serve as a supervisory signal and do not require extra manual annotation. We demonstrate that this seemingly simple task provides a strong signal for feature learning and with self-supervised learning, we achieve a high segmentation accuracy that is better than or comparable to a U-net trained from scratch, especially at a small data setting. When only five annotated subjects are available, the proposed method improves the mean Dice metric from 0.811 to 0.852 for short-axis image segmentation, compared to the baseline U-net.

IVJun 28, 2019
Explainable Anatomical Shape Analysis through Deep Hierarchical Generative Models

Carlo Biffi, Juan J. Cerrolaza, Giacomo Tarroni et al.

Quantification of anatomical shape changes currently relies on scalar global indexes which are largely insensitive to regional or asymmetric modifications. Accurate assessment of pathology-driven anatomical remodeling is a crucial step for the diagnosis and treatment of many conditions. Deep learning approaches have recently achieved wide success in the analysis of medical images, but they lack interpretability in the feature extraction and decision processes. In this work, we propose a new interpretable deep learning model for shape analysis. In particular, we exploit deep generative networks to model a population of anatomical segmentations through a hierarchy of conditional latent variables. At the highest level of this hierarchy, a two-dimensional latent space is simultaneously optimised to discriminate distinct clinical conditions, enabling the direct visualisation of the classification space. Moreover, the anatomical variability encoded by this discriminative latent space can be visualised in the segmentation space thanks to the generative properties of the model, making the classification task transparent. This approach yielded high accuracy in the categorisation of healthy and remodelled left ventricles when tested on unseen segmentations from our own multi-centre dataset as well as in an external validation set, and on hippocampi from healthy controls and patients with Alzheimer's disease when tested on ADNI data. More importantly, it enabled the visualisation in three-dimensions of both global and regional anatomical features which better discriminate between the conditions under exam. The proposed approach scales effectively to large populations, facilitating high-throughput analysis of normal anatomy and pathology in large-scale studies of volumetric imaging.

CVFeb 28, 2019
3D High-Resolution Cardiac Segmentation Reconstruction from 2D Views using Conditional Variational Autoencoders

Carlo Biffi, Juan J. Cerrolaza, Giacomo Tarroni et al.

Accurate segmentation of heart structures imaged by cardiac MR is key for the quantitative analysis of pathology. High-resolution 3D MR sequences enable whole-heart structural imaging but are time-consuming, expensive to acquire and they often require long breath holds that are not suitable for patients. Consequently, multiplanar breath-hold 2D cine sequences are standard practice but are disadvantaged by lack of whole-heart coverage and low through-plane resolution. To address this, we propose a conditional variational autoencoder architecture able to learn a generative model of 3D high-resolution left ventricular (LV) segmentations which is conditioned on three 2D LV segmentations of one short-axis and two long-axis images. By only employing these three 2D segmentations, our model can efficiently reconstruct the 3D high-resolution LV segmentation of a subject. When evaluated on 400 unseen healthy volunteers, our model yielded an average Dice score of $87.92 \pm 0.15$ and outperformed competing architectures.

CVOct 3, 2018
A Comprehensive Approach for Learning-based Fully-Automated Inter-slice Motion Correction for Short-Axis Cine Cardiac MR Image Stacks

Giacomo Tarroni, Ozan Oktay, Matthew Sinclair et al.

In the clinical routine, short axis (SA) cine cardiac MR (CMR) image stacks are acquired during multiple subsequent breath-holds. If the patient cannot consistently hold the breath at the same position, the acquired image stack will be affected by inter-slice respiratory motion and will not correctly represent the cardiac volume, introducing potential errors in the following analyses and visualisations. We propose an approach to automatically correct inter-slice respiratory motion in SA CMR image stacks. Our approach makes use of probabilistic segmentation maps (PSMs) of the left ventricular (LV) cavity generated with decision forests. PSMs are generated for each slice of the SA stack and rigidly registered in-plane to a target PSM. If long axis (LA) images are available, PSMs are generated for them and combined to create the target PSM; if not, the target PSM is produced from the same stack using a 3D model trained from motion-free stacks. The proposed approach was tested on a dataset of SA stacks acquired from 24 healthy subjects (for which anatomical 3D cardiac images were also available as reference) and compared to two techniques which use LA intensity images and LA segmentations as targets, respectively. The results show the accuracy and robustness of the proposed approach in motion compensation.

CVAug 1, 2018
Recurrent neural networks for aortic image sequence segmentation with sparse annotations

Wenjia Bai, Hideaki Suzuki, Chen Qin et al.

Segmentation of image sequences is an important task in medical image analysis, which enables clinicians to assess the anatomy and function of moving organs. However, direct application of a segmentation algorithm to each time frame of a sequence may ignore the temporal continuity inherent in the sequence. In this work, we propose an image sequence segmentation algorithm by combining a fully convolutional network with a recurrent neural network, which incorporates both spatial and temporal information into the segmentation task. A key challenge in training this network is that the available manual annotations are temporally sparse, which forbids end-to-end training. We address this challenge by performing non-rigid label propagation on the annotations and introducing an exponentially weighted loss function for training. Experiments on aortic MR image sequences demonstrate that the proposed method significantly improves both accuracy and temporal smoothness of segmentation, compared to a baseline method that utilises spatial information only. It achieves an average Dice metric of 0.960 for the ascending aorta and 0.953 for the descending aorta.

CVJul 18, 2018
Learning Interpretable Anatomical Features Through Deep Generative Models: Application to Cardiac Remodeling

Carlo Biffi, Ozan Oktay, Giacomo Tarroni et al.

Alterations in the geometry and function of the heart define well-established causes of cardiovascular disease. However, current approaches to the diagnosis of cardiovascular diseases often rely on subjective human assessment as well as manual analysis of medical images. Both factors limit the sensitivity in quantifying complex structural and functional phenotypes. Deep learning approaches have recently achieved success for tasks such as classification or segmentation of medical images, but lack interpretability in the feature extraction and decision processes, limiting their value in clinical diagnosis. In this work, we propose a 3D convolutional generative model for automatic classification of images from patients with cardiac diseases associated with structural remodeling. The model leverages interpretable task-specific anatomic patterns learned from 3D segmentations. It further allows to visualise and quantify the learned pathology-specific remodeling patterns in the original input space of the images. This approach yields high accuracy in the categorization of healthy and hypertrophic cardiomyopathy subjects when tested on unseen MR images from our own multi-centre dataset (100%) as well on the ACDC MICCAI 2017 dataset (90%). We believe that the proposed deep learning approach is a promising step towards the development of interpretable classifiers for the medical imaging domain, which may help clinicians to improve diagnostic accuracy and enhance patient risk-stratification.

CVMar 25, 2018
Learning-Based Quality Control for Cardiac MR Images

Giacomo Tarroni, Ozan Oktay, Wenjia Bai et al.

The effectiveness of a cardiovascular magnetic resonance (CMR) scan depends on the ability of the operator to correctly tune the acquisition parameters to the subject being scanned and on the potential occurrence of imaging artefacts such as cardiac and respiratory motion. In the clinical practice, a quality control step is performed by visual assessment of the acquired images: however, this procedure is strongly operator-dependent, cumbersome and sometimes incompatible with the time constraints in clinical settings and large-scale studies. We propose a fast, fully-automated, learning-based quality control pipeline for CMR images, specifically for short-axis image stacks. Our pipeline performs three important quality checks: 1) heart coverage estimation, 2) inter-slice motion detection, 3) image contrast estimation in the cardiac region. The pipeline uses a hybrid decision forest method - integrating both regression and structured classification models - to extract landmarks as well as probabilistic segmentation maps from both long- and short-axis images as a basis to perform the quality checks. The technique was tested on up to 3000 cases from the UK Biobank as well as on 100 cases from the UK Digital Heart Project, and validated against manual annotations and visual inspections performed by expert interpreters. The results show the capability of the proposed pipeline to correctly detect incomplete or corrupted scans (e.g. on UK Biobank, sensitivity and specificity respectively 88% and 99% for heart coverage estimation, 85% and 95% for motion detection), allowing their exclusion from the analysed dataset or the triggering of a new acquisition.

CVOct 25, 2017
Automated cardiovascular magnetic resonance image analysis with fully convolutional networks

Wenjia Bai, Matthew Sinclair, Giacomo Tarroni et al.

Cardiovascular magnetic resonance (CMR) imaging is a standard imaging modality for assessing cardiovascular diseases (CVDs), the leading cause of death globally. CMR enables accurate quantification of the cardiac chamber volume, ejection fraction and myocardial mass, providing information for diagnosis and monitoring of CVDs. However, for years, clinicians have been relying on manual approaches for CMR image analysis, which is time consuming and prone to subjective errors. It is a major clinical challenge to automatically derive quantitative and clinically relevant information from CMR images. Deep neural networks have shown a great potential in image pattern recognition and segmentation for a variety of tasks. Here we demonstrate an automated analysis method for CMR images, which is based on a fully convolutional network (FCN). The network is trained and evaluated on a large-scale dataset from the UK Biobank, consisting of 4,875 subjects with 93,500 pixelwise annotated images. The performance of the method has been evaluated using a number of technical metrics, including the Dice metric, mean contour distance and Hausdorff distance, as well as clinically relevant measures, including left ventricle (LV) end-diastolic volume (LVEDV) and end-systolic volume (LVESV), LV mass (LVM); right ventricle (RV) end-diastolic volume (RVEDV) and end-systolic volume (RVESV). By combining FCN with a large-scale annotated dataset, the proposed automated method achieves a high performance on par with human experts in segmenting the LV and RV on short-axis CMR images and the left atrium (LA) and right atrium (RA) on long-axis CMR images.

CVJun 3, 2016
Learning under Distributed Weak Supervision

Martin Rajchl, Matthew C. H. Lee, Franklin Schrans et al.

The availability of training data for supervision is a frequently encountered bottleneck of medical image analysis methods. While typically established by a clinical expert rater, the increase in acquired imaging data renders traditional pixel-wise segmentations less feasible. In this paper, we examine the use of a crowdsourcing platform for the distribution of super-pixel weak annotation tasks and collect such annotations from a crowd of non-expert raters. The crowd annotations are subsequently used for training a fully convolutional neural network to address the problem of fetal brain segmentation in T2-weighted MR images. Using this approach we report encouraging results compared to highly targeted, fully supervised methods and potentially address a frequent problem impeding image analysis research.