IVApr 6, 2022
Mitosis domain generalization in histopathology images -- The MIDOG challengeMarc Aubreville, Nikolas Stathonikos, Christof A. Bertram et al.
The density of mitotic figures within tumor tissue is known to be highly correlated with tumor proliferation and thus is an important marker in tumor grading. Recognition of mitotic figures by pathologists is known to be subject to a strong inter-rater bias, which limits the prognostic value. State-of-the-art deep learning methods can support the expert in this assessment but are known to strongly deteriorate when applied in a different clinical environment than was used for training. One decisive component in the underlying domain shift has been identified as the variability caused by using different whole slide scanners. The goal of the MICCAI MIDOG 2021 challenge has been to propose and evaluate methods that counter this domain shift and derive scanner-agnostic mitosis detection algorithms. The challenge used a training set of 200 cases, split across four scanning systems. As a test set, an additional 100 cases split across four scanning systems, including two previously unseen scanners, were given. The best approaches performed on an expert level, with the winning algorithm yielding an F_1 score of 0.748 (CI95: 0.704-0.781). In this paper, we evaluate and compare the approaches that were submitted to the challenge and identify methodological factors contributing to better performance.
CVNov 18, 2023
Radiology Report Generation Using Transformers Conditioned with Non-imaging DataNurbanu Aksoy, Nishant Ravikumar, Alejandro F Frangi
Medical image interpretation is central to most clinical applications such as disease diagnosis, treatment planning, and prognostication. In clinical practice, radiologists examine medical images and manually compile their findings into reports, which can be a time-consuming process. Automated approaches to radiology report generation, therefore, can reduce radiologist workload and improve efficiency in the clinical pathway. While recent deep-learning approaches for automated report generation from medical images have seen some success, most studies have relied on image-derived features alone, ignoring non-imaging patient data. Although a few studies have included the word-level contexts along with the image, the use of patient demographics is still unexplored. This paper proposes a novel multi-modal transformer network that integrates chest x-ray (CXR) images and associated patient demographic information, to synthesise patient-specific radiology reports. The proposed network uses a convolutional neural network to extract visual features from CXRs and a transformer-based encoder-decoder network that combines the visual features with semantic text embeddings of patient demographic information, to synthesise full-text radiology reports. Data from two public databases were used to train and evaluate the proposed approach. CXRs and reports were extracted from the MIMIC-CXR database and combined with corresponding patients' data MIMIC-IV. Based on the evaluation metrics used including patient demographic information was found to improve the quality of reports generated using the proposed approach, relative to a baseline network trained using CXRs alone. The proposed approach shows potential for enhancing radiology report generation by leveraging rich patient metadata and combining semantic text embeddings derived thereof, with medical image-derived visual features.
CVJul 1, 2022
Agent with Tangent-based Formulation and Anatomical Perception for Standard Plane Localization in 3D UltrasoundYuxin Zou, Haoran Dou, Yuhao Huang et al.
Standard plane (SP) localization is essential in routine clinical ultrasound (US) diagnosis. Compared to 2D US, 3D US can acquire multiple view planes in one scan and provide complete anatomy with the addition of coronal plane. However, manually navigating SPs in 3D US is laborious and biased due to the orientation variability and huge search space. In this study, we introduce a novel reinforcement learning (RL) framework for automatic SP localization in 3D US. Our contribution is three-fold. First, we formulate SP localization in 3D US as a tangent-point-based problem in RL to restructure the action space and significantly reduce the search space. Second, we design an auxiliary task learning strategy to enhance the model's ability to recognize subtle differences crossing Non-SPs and SPs in plane search. Finally, we propose a spatial-anatomical reward to effectively guide learning trajectories by exploiting spatial and anatomical information simultaneously. We explore the efficacy of our approach on localizing four SPs on uterus and fetal brain datasets. The experiments indicate that our approach achieves a high localization accuracy as well as robust performance.
IVJun 26, 2023
GSMorph: Gradient Surgery for cine-MRI Cardiac Deformable RegistrationHaoran Dou, Ning Bi, Luyi Han et al.
Deep learning-based deformable registration methods have been widely investigated in diverse medical applications. Learning-based deformable registration relies on weighted objective functions trading off registration accuracy and smoothness of the deformation field. Therefore, they inevitably require tuning the hyperparameter for optimal registration performance. Tuning the hyperparameters is highly computationally expensive and introduces undesired dependencies on domain knowledge. In this study, we construct a registration model based on the gradient surgery mechanism, named GSMorph, to achieve a hyperparameter-free balance on multiple losses. In GSMorph, we reformulate the optimization procedure by projecting the gradient of similarity loss orthogonally to the plane associated with the smoothness constraint, rather than additionally introducing a hyperparameter to balance these two competing terms. Furthermore, our method is model-agnostic and can be merged into any deep registration network without introducing extra parameters or slowing down inference. In this study, We compared our method with state-of-the-art (SOTA) deformable registration approaches over two publicly available cardiac MRI datasets. GSMorph proves superior to five SOTA learning-based registration models and two conventional registration techniques, SyN and Demons, on both registration accuracy and smoothness.
IVAug 13, 2023
Shape-guided Conditional Latent Diffusion Models for Synthesising Brain VasculatureYash Deo, Haoran Dou, Nishant Ravikumar et al.
The Circle of Willis (CoW) is the part of cerebral vasculature responsible for delivering blood to the brain. Understanding the diverse anatomical variations and configurations of the CoW is paramount to advance research on cerebrovascular diseases and refine clinical interventions. However, comprehensive investigation of less prevalent CoW variations remains challenging because of the dominance of a few commonly occurring configurations. We propose a novel generative approach utilising a conditional latent diffusion model with shape and anatomical guidance to generate realistic 3D CoW segmentations, including different phenotypical variations. Our conditional latent diffusion model incorporates shape guidance to better preserve vessel continuity and demonstrates superior performance when compared to alternative generative models, including conditional variants of 3D GAN and 3D VAE. We observed that our model generated CoW variants that are more realistic and demonstrate higher visual fidelity than competing approaches with an FID score 53\% better than the best-performing GAN-based model.
CVFeb 6, 2023
Learning disentangled representations for explainable chest X-ray classification using Dirichlet VAEsRachael Harkness, Alejandro F Frangi, Kieran Zucker et al.
This study explores the use of the Dirichlet Variational Autoencoder (DirVAE) for learning disentangled latent representations of chest X-ray (CXR) images. Our working hypothesis is that distributional sparsity, as facilitated by the Dirichlet prior, will encourage disentangled feature learning for the complex task of multi-label classification of CXR images. The DirVAE is trained using CXR images from the CheXpert database, and the predictive capacity of multi-modal latent representations learned by DirVAE models is investigated through implementation of an auxiliary multi-label classification task, with a view to enforce separation of latent factors according to class-specific features. The predictive performance and explainability of the latent space learned using the DirVAE were quantitatively and qualitatively assessed, respectively, and compared with a standard Gaussian prior-VAE (GVAE). We introduce a new approach for explainable multi-label classification in which we conduct gradient-guided latent traversals for each class of interest. Study findings indicate that the DirVAE is able to disentangle latent factors into class-specific visual features, a property not afforded by the GVAE, and achieve a marginal increase in predictive performance relative to GVAE. We generate visual examples to show that our explainability method, when applied to the trained DirVAE, is able to highlight regions in CXR images that are clinically relevant to the class(es) of interest and additionally, can identify cases where classification relies on spurious feature correlations.
CVNov 24, 2022
Joint segmentation and discontinuity-preserving deformable registration: Application to cardiac cine-MR imagesXiang Chen, Yan Xia, Nishant Ravikumar et al.
Medical image registration is a challenging task involving the estimation of spatial transformations to establish anatomical correspondence between pairs or groups of images. Recently, deep learning-based image registration methods have been widely explored, and demonstrated to enable fast and accurate image registration in a variety of applications. However, most deep learning-based registration methods assume that the deformation fields are smooth and continuous everywhere in the image domain, which is not always true, especially when registering images whose fields of view contain discontinuities at tissue/organ boundaries. In such scenarios, enforcing smooth, globally continuous deformation fields leads to incorrect/implausible registration results. We propose a novel discontinuity-preserving image registration method to tackle this challenge, which ensures globally discontinuous and locally smooth deformation fields, leading to more accurate and realistic registration results. The proposed method leverages the complementary nature of image segmentation and registration and enables joint segmentation and pair-wise registration of images. A co-attention block is proposed in the segmentation component of the network to learn the structural correlations in the input images, while a discontinuity-preserving registration strategy is employed in the registration component of the network to ensure plausibility in the estimated deformation fields at tissue/organ interfaces. We evaluate our method on the task of intra-subject spatio-temporal image registration using large-scale cinematic cardiac magnetic resonance image sequences, and demonstrate that our method achieves significant improvements over the state-of-the-art for medical image registration, and produces high-quality segmentation masks for the regions of interest.
IVMar 31, 2023
Artificial Intelligence in Ovarian Cancer Histopathology: A Systematic ReviewJack Breen, Katie Allen, Kieran Zucker et al.
Purpose - To characterise and assess the quality of published research evaluating artificial intelligence (AI) methods for ovarian cancer diagnosis or prognosis using histopathology data. Methods - A search of PubMed, Scopus, Web of Science, CENTRAL, and WHO-ICTRP was conducted up to 19/05/2023. The inclusion criteria required that research evaluated AI on histopathology images for diagnostic or prognostic inferences in ovarian cancer. The risk of bias was assessed using PROBAST. Information about each model of interest was tabulated and summary statistics were reported. PRISMA 2020 reporting guidelines were followed. Results - 1573 records were identified, of which 45 were eligible for inclusion. There were 80 models of interest, including 37 diagnostic models, 22 prognostic models, and 21 models with other diagnostically relevant outcomes. Models were developed using 1-1375 slides from 1-776 ovarian cancer patients. Model outcomes included treatment response (11/80), malignancy status (10/80), stain quantity (9/80), and histological subtype (7/80). All models were found to be at high or unclear risk of bias overall, with most research having a high risk of bias in the analysis and a lack of clarity regarding participants and predictors in the study. Research frequently suffered from insufficient reporting and limited validation using small sample sizes. Conclusion - Limited research has been conducted on the application of AI to histopathology images for diagnostic or prognostic purposes in ovarian cancer, and none of the associated models have been demonstrated to be ready for real-world implementation. Key aspects to help ensure clinical translation include more transparent and comprehensive reporting of data provenance and modelling approaches, as well as improved quantitative performance evaluation using cross-validation and external validations.
IVAug 7, 2023
Adaptive Semi-Supervised Segmentation of Brain Vessels with Ambiguous LabelsFengming Lin, Yan Xia, Nishant Ravikumar et al.
Accurate segmentation of brain vessels is crucial for cerebrovascular disease diagnosis and treatment. However, existing methods face challenges in capturing small vessels and handling datasets that are partially or ambiguously annotated. In this paper, we propose an adaptive semi-supervised approach to address these challenges. Our approach incorporates innovative techniques including progressive semi-supervised learning, adaptative training strategy, and boundary enhancement. Experimental results on 3DRA datasets demonstrate the superiority of our method in terms of mesh-based segmentation metrics. By leveraging the partially and ambiguously labeled data, which only annotates the main vessels, our method achieves impressive segmentation performance on mislabeled fine vessels, showcasing its potential for clinical applications.
IVJun 26, 2023
A Conditional Flow Variational Autoencoder for Controllable Synthesis of Virtual Populations of AnatomyHaoran Dou, Nishant Ravikumar, Alejandro F. Frangi
The generation of virtual populations (VPs) of anatomy is essential for conducting in silico trials of medical devices. Typically, the generated VP should capture sufficient variability while remaining plausible and should reflect the specific characteristics and demographics of the patients observed in real populations. In several applications, it is desirable to synthesise virtual populations in a \textit{controlled} manner, where relevant covariates are used to conditionally synthesise virtual populations that fit a specific target population/characteristics. We propose to equip a conditional variational autoencoder (cVAE) with normalising flows to boost the flexibility and complexity of the approximate posterior learnt, leading to enhanced flexibility for controllable synthesis of VPs of anatomical structures. We demonstrate the performance of our conditional flow VAE using a data set of cardiac left ventricles acquired from 2360 patients, with associated demographic information and clinical measurements (used as covariates/conditional information). The results obtained indicate the superiority of the proposed method for conditional synthesis of virtual populations of cardiac left ventricles relative to a cVAE. Conditional synthesis performance was evaluated in terms of generalisation and specificity errors and in terms of the ability to preserve clinically relevant biomarkers in synthesised VPs, that is, the left ventricular blood pool and myocardial volume, relative to the real observed population.
IVJun 30, 2022
Localizing the Recurrent Laryngeal Nerve via Ultrasound with a Bayesian Shape FrameworkHaoran Dou, Luyi Han, Yushuang He et al.
Tumor infiltration of the recurrent laryngeal nerve (RLN) is a contraindication for robotic thyroidectomy and can be difficult to detect via standard laryngoscopy. Ultrasound (US) is a viable alternative for RLN detection due to its safety and ability to provide real-time feedback. However, the tininess of the RLN, with a diameter typically less than 3mm, poses significant challenges to the accurate localization of the RLN. In this work, we propose a knowledge-driven framework for RLN localization, mimicking the standard approach surgeons take to identify the RLN according to its surrounding organs. We construct a prior anatomical model based on the inherent relative spatial relationships between organs. Through Bayesian shape alignment (BSA), we obtain the candidate coordinates of the center of a region of interest (ROI) that encloses the RLN. The ROI allows a decreased field of view for determining the refined centroid of the RLN using a dual-path identification network, based on multi-scale semantic information. Experimental results indicate that the proposed method achieves superior hit rates and substantially smaller distance errors compared with state-of-the-art methods.
CVApr 20, 2022
Epistemic Uncertainty-Weighted Loss for Visual Bias MitigationRebecca S Stone, Nishant Ravikumar, Andrew J Bulpitt et al.
Deep neural networks are highly susceptible to learning biases in visual data. While various methods have been proposed to mitigate such bias, the majority require explicit knowledge of the biases present in the training data in order to mitigate. We argue the relevance of exploring methods which are completely ignorant of the presence of any bias, but are capable of identifying and mitigating them. Furthermore, we propose using Bayesian neural networks with a predictive uncertainty-weighted loss function to dynamically identify potential bias in individual training samples and to weight them during training. We find a positive correlation between samples subject to bias and higher epistemic uncertainties. Finally, we show the method has potential to mitigate visual bias on a bias benchmark dataset and on a real-world face detection problem, and we consider the merits and weaknesses of our approach.
GNJan 7, 2023
Unsupervised ensemble-based phenotyping helps enhance the discoverability of genes related to heart morphologyRodrigo Bonazzola, Enzo Ferrante, Nishant Ravikumar et al.
Recent genome-wide association studies (GWAS) have been successful in identifying associations between genetic variants and simple cardiac parameters derived from cardiac magnetic resonance (CMR) images. However, the emergence of big databases including genetic data linked to CMR, facilitates investigation of more nuanced patterns of shape variability. Here, we propose a new framework for gene discovery entitled Unsupervised Phenotype Ensembles (UPE). UPE builds a redundant yet highly expressive representation by pooling a set of phenotypes learned in an unsupervised manner, using deep learning models trained with different hyperparameters. These phenotypes are then analyzed via (GWAS), retaining only highly confident and stable associations across the ensemble. We apply our approach to the UK Biobank database to extract left-ventricular (LV) geometric features from image-derived three-dimensional meshes. We demonstrate that our approach greatly improves the discoverability of genes influencing LV shape, identifying 11 loci with study-wide significance and 8 with suggestive significance. We argue that our approach would enable more extensive discovery of gene associations with image-derived phenotypes for other organs or image modalities.
IVNov 22, 2023
Multi-view Hybrid Graph Convolutional Network for Volume-to-mesh Reconstruction in Cardiovascular MRINicolás Gaggion, Benjamin A. Matheson, Yan Xia et al.
Cardiovascular magnetic resonance imaging is emerging as a crucial tool to examine cardiac morphology and function. Essential to this endeavour are anatomical 3D surface and volumetric meshes derived from CMR images, which facilitate computational anatomy studies, biomarker discovery, and in-silico simulations. Traditional approaches typically follow complex multi-step pipelines, first segmenting images and then reconstructing meshes, making them time-consuming and prone to error propagation. In response, we introduce HybridVNet, a novel architecture for direct image-to-mesh extraction seamlessly integrating standard convolutional neural networks with graph convolutions, which we prove can efficiently handle surface and volumetric meshes by encoding them as graph structures. To further enhance accuracy, we propose a multi-view HybridVNet architecture which processes both long axis and short axis CMR, showing that it can increase the performance of cardiac MR mesh generation. Our model combines traditional convolutional networks with variational graph generative models, deep supervision and mesh-specific regularisation. Experiments on a comprehensive dataset from the UK Biobank confirm the potential of HybridVNet to significantly advance cardiac imaging and computational cardiology by efficiently generating high-fidelity meshes from CMR images. Multi-view HybridVNet outperforms the state-of-the-art, achieving improvements of up to $\sim$27\% reduction in Mean Contour Distance (from 1.86 mm to 1.35 mm for the LV Myocardium), up to $\sim$18\% improvement in Hausdorff distance (from 4.74 mm to 3.89mm, for the LV Endocardium), and up to $\sim$8\% in Dice Coefficient (from 0.78 to 0.84, for the LV Myocardium), highlighting its superior accuracy.
IVFeb 17, 2023
Efficient subtyping of ovarian cancer histopathology whole slide images using active sampling in multiple instance learningJack Breen, Katie Allen, Kieran Zucker et al.
Weakly-supervised classification of histopathology slides is a computationally intensive task, with a typical whole slide image (WSI) containing billions of pixels to process. We propose Discriminative Region Active Sampling for Multiple Instance Learning (DRAS-MIL), a computationally efficient slide classification method using attention scores to focus sampling on highly discriminative regions. We apply this to the diagnosis of ovarian cancer histological subtypes, which is an essential part of the patient care pathway as different subtypes have different genetic and molecular profiles, treatment options, and patient outcomes. We use a dataset of 714 WSIs acquired from 147 epithelial ovarian cancer patients at Leeds Teaching Hospitals NHS Trust to distinguish the most common subtype, high-grade serous carcinoma, from the other four subtypes (low-grade serous, endometrioid, clear cell, and mucinous carcinomas) combined. We demonstrate that DRAS-MIL can achieve similar classification performance to exhaustive slide analysis, with a 3-fold cross-validated AUC of 0.8679 compared to 0.8781 with standard attention-based MIL classification. Our approach uses at most 18% as much memory as the standard approach, while taking 33% of the time when evaluating on a GPU and only 14% on a CPU alone. Reducing prediction time and memory requirements may benefit clinical deployment and the democratisation of AI, reducing the extent to which computational hardware limits end-user adoption.
IVAug 24, 2023
Learned Local Attention Maps for Synthesising Vessel SegmentationsYash Deo, Rodrigo Bonazzola, Haoran Dou et al.
Magnetic resonance angiography (MRA) is an imaging modality for visualising blood vessels. It is useful for several diagnostic applications and for assessing the risk of adverse events such as haemorrhagic stroke (resulting from the rupture of aneurysms in blood vessels). However, MRAs are not acquired routinely, hence, an approach to synthesise blood vessel segmentations from more routinely acquired MR contrasts such as T1 and T2, would be useful. We present an encoder-decoder model for synthesising segmentations of the main cerebral arteries in the circle of Willis (CoW) from only T2 MRI. We propose a two-phase multi-objective learning approach, which captures both global and local features. It uses learned local attention maps generated by dilating the segmentation labels, which forces the network to only extract information from the T2 MRI relevant to synthesising the CoW. Our synthetic vessel segmentations generated from only T2 MRI achieved a mean Dice score of $0.79 \pm 0.03$ in testing, compared to state-of-the-art segmentation networks such as transformer U-Net ($0.71 \pm 0.04$) and nnU-net($0.68 \pm 0.05$), while using only a fraction of the parameters. The main qualitative difference between our synthetic vessel segmentations and the comparative models was in the sharper resolution of the CoW vessel segments, especially in the posterior circulation.
CVNov 18, 2023
Beyond Images: An Integrative Multi-modal Approach to Chest X-Ray Report GenerationNurbanu Aksoy, Serge Sharoff, Selcuk Baser et al.
Image-to-text radiology report generation aims to automatically produce radiology reports that describe the findings in medical images. Most existing methods focus solely on the image data, disregarding the other patient information accessible to radiologists. In this paper, we present a novel multi-modal deep neural network framework for generating chest X-rays reports by integrating structured patient data, such as vital signs and symptoms, alongside unstructured clinical notes.We introduce a conditioned cross-multi-head attention module to fuse these heterogeneous data modalities, bridging the semantic gap between visual and textual data. Experiments demonstrate substantial improvements from using additional modalities compared to relying on images alone. Notably, our model achieves the highest reported performance on the ROUGE-L metric compared to relevant state-of-the-art models in the literature. Furthermore, we employed both human evaluation and clinical semantic similarity measurement alongside word-overlap metrics to improve the depth of quantitative analysis. A human evaluation, conducted by a board-certified radiologist, confirms the model's accuracy in identifying high-level findings, however, it also highlights that more improvement is needed to capture nuanced details and clinical context.
IVOct 4, 2022
A Generative Shape Compositional Framework to Synthesise Populations of Virtual ChimaerasHaoran Dou, Seppo Virtanen, Nishant Ravikumar et al.
Generating virtual populations of anatomy that capture sufficient variability while remaining plausible is essential for conducting in-silico trials of medical devices. However, not all anatomical shapes of interest are always available for each individual in a population. Hence, missing/partially-overlapping anatomical information is often available across individuals in a population. We introduce a generative shape model for complex anatomical structures, learnable from datasets of unpaired datasets. The proposed generative model can synthesise complete whole complex shape assemblies coined virtual chimaeras, as opposed to natural human chimaeras. We applied this framework to build virtual chimaeras from databases of whole-heart shape assemblies that each contribute samples for heart substructures. Specifically, we propose a generative shape compositional framework which comprises two components - a part-aware generative shape model which captures the variability in shape observed for each structure of interest in the training population; and a spatial composition network which assembles/composes the structures synthesised by the former into multi-part shape assemblies (viz. virtual chimaeras). We also propose a novel self supervised learning scheme that enables the spatial composition network to be trained with partially overlapping data and weak labels. We trained and validated our approach using shapes of cardiac structures derived from cardiac magnetic resonance images available in the UK Biobank. Our approach significantly outperforms a PCA-based shape model (trained with complete data) in terms of generalisability and specificity. This demonstrates the superiority of the proposed approach as the synthesised cardiac virtual populations are more plausible and capture a greater degree of variability in shape than those generated by the PCA-based shape model.
CVMar 29, 2023
Implicit Visual Bias Mitigation by Posterior Estimate Sharpening of a Bayesian Neural NetworkRebecca S Stone, Nishant Ravikumar, Andrew J Bulpitt et al.
The fairness of a deep neural network is strongly affected by dataset bias and spurious correlations, both of which are usually present in modern feature-rich and complex visual datasets. Due to the difficulty and variability of the task, no single de-biasing method has been universally successful. In particular, implicit methods not requiring explicit knowledge of bias variables are especially relevant for real-world applications. We propose a novel implicit mitigation method using a Bayesian neural network, allowing us to leverage the relationship between epistemic uncertainties and the presence of bias or spurious correlations in a sample. Our proposed posterior estimate sharpening procedure encourages the network to focus on core features that do not contribute to high uncertainties. Experimental results on three benchmark datasets demonstrate that Bayesian networks with sharpened posterior estimates perform comparably to prior existing methods and show potential worthy of further exploration.
IVAug 5, 2023
Generative Adversarial Networks for Stain Normalisation in HistopathologyJack Breen, Kieran Zucker, Katie Allen et al.
The rapid growth of digital pathology in recent years has provided an ideal opportunity for the development of artificial intelligence-based tools to improve the accuracy and efficiency of clinical diagnoses. One of the significant roadblocks to current research is the high level of visual variability across digital pathology images, causing models to generalise poorly to unseen data. Stain normalisation aims to standardise the visual profile of digital pathology images without changing the structural content of the images. In this chapter, we explore different techniques which have been used for stain normalisation in digital pathology, with a focus on approaches which utilise generative adversarial networks (GANs). Typically, GAN-based methods outperform non-generative approaches but at the cost of much greater computational requirements. However, it is not clear which method is best for stain normalisation in general, with different GAN and non-GAN approaches outperforming each other in different scenarios and according to different performance metrics. This is an ongoing field of study as researchers aim to identify a method which efficiently and effectively normalises pathology images to make AI models more robust and generalisable.
MED-PHAug 22, 2022
Three-dimensional micro-structurally informed in silico myocardium -- towards virtual imaging trials in cardiac diffusion weighted MRIMojtaba Lashgari, Nishant Ravikumar, Irvin Teh et al.
In silico tissue models enable evaluating quantitative models of magnetic resonance imaging. This includes validating and sensitivity analysis of imaging biomarkers and tissue microstructure parameters. We propose a novel method to generate a realistic numerical phantom of myocardial microstructure. We extend previous studies accounting for the cardiomyocyte shape variability, water exchange between the cardiomyocytes (intercalated discs), myocardial microstructure disarray, and four sheetlet orientations. In the first stage of the method, cardiomyocytes and sheetlets are generated by considering the shape variability and intercalated discs in cardiomyocyte-to-cardiomyocyte connections. Sheetlets are then aggregated and oriented in the directions of interest. Our morphometric study demonstrates no significant difference ($p>0.01$) between the distribution of volume, length, and primary and secondary axes of the numerical and real (literature) cardiomyocyte data. Structural correlation analysis validates that the in-silico tissue is in the same class of disorderliness as the real tissue. Additionally, the absolute angle differences between the simulated helical angle (HA) and input HA (reference value) of the cardiomyocytes ($4.3^\circ\pm 3.1^\circ$) demonstrate a good agreement with the absolute angle difference between the measured HA using experimental cardiac diffusion tensor imaging (cDTI) and histology (reference value) reported by (Holmes et al., 2000) ($3.7^\circ\pm6.4^\circ$) and (Scollan et al., 1998) ($4.9^\circ\pm 14.6^\circ$). The angular distance between eigenvectors and sheetlet angles of the input and simulated cDTI is smaller than those between measured angles using structural tensor imaging (gold standard) and experimental cDTI. These results confirm that the proposed method can generate richer numerical phantoms for the myocardium than previous studies.
IVOct 19, 2023
Predicting Ovarian Cancer Treatment Response in Histopathology using Hierarchical Vision Transformers and Multiple Instance LearningJack Breen, Katie Allen, Kieran Zucker et al.
For many patients, current ovarian cancer treatments offer limited clinical benefit. For some therapies, it is not possible to predict patients' responses, potentially exposing them to the adverse effects of treatment without any therapeutic benefit. As part of the automated prediction of treatment effectiveness in ovarian cancer using histopathological images (ATEC23) challenge, we evaluated the effectiveness of deep learning to predict whether a course of treatment including the antiangiogenic drug bevacizumab could contribute to remission or prevent disease progression for at least 6 months in a set of 282 histopathology whole slide images (WSIs) from 78 ovarian cancer patients. Our approach used a pretrained Hierarchical Image Pyramid Transformer (HIPT) to extract region-level features and an attention-based multiple instance learning (ABMIL) model to aggregate features and classify whole slides. The optimal HIPT-ABMIL model had an internal balanced accuracy of 60.2% +- 2.9% and an AUC of 0.646 +- 0.033. Histopathology-specific model pretraining was found to be beneficial to classification performance, though hierarchical transformers were not, with a ResNet feature extractor achieving similar performance. Due to the dataset being small and highly heterogeneous, performance was variable across 5-fold cross-validation folds, and there were some extreme differences between validation and test set performance within folds. The model did not generalise well to tissue microarrays, with accuracy worse than random chance. It is not yet clear whether ovarian cancer WSIs contain information that can be used to accurately predict treatment response, with further validation using larger, higher-quality datasets required.
IVJul 25, 2024
Multi-Resolution Histopathology Patch Graphs for Ovarian Cancer SubtypingJack Breen, Katie Allen, Kieran Zucker et al.
Computer vision models are increasingly capable of classifying ovarian epithelial cancer subtypes, but they differ from pathologists by processing small tissue patches at a single resolution. Multi-resolution graph models leverage the spatial relationships of patches at multiple magnifications, learning the context for each patch. In this study, we conduct the most thorough validation of a graph model for ovarian cancer subtyping to date. Seven models were tuned and trained using five-fold cross-validation on a set of 1864 whole slide images (WSIs) from 434 patients treated at Leeds Teaching Hospitals NHS Trust. The cross-validation models were ensembled and evaluated using a balanced hold-out test set of 100 WSIs from 30 patients, and an external validation set of 80 WSIs from 80 patients in the Transcanadian Study. The best-performing model, a graph model using 10x+20x magnification data, gave balanced accuracies of 73%, 88%, and 99% in cross-validation, hold-out testing, and external validation, respectively. However, this only exceeded the performance of attention-based multiple instance learning in external validation, with a 93% balanced accuracy. Graph models benefitted greatly from using the UNI foundation model rather than an ImageNet-pretrained ResNet50 for feature extraction, with this having a much greater effect on performance than changing the subsequent classification approach. The accuracy of the combined foundation model and multi-resolution graph network offers a step towards the clinical applicability of these models, with a new highest-reported performance for this task, though further validations are still required to ensure the robustness and usability of the models.
CVSep 14, 2021Code
The pitfalls of using open data to develop deep learning solutions for COVID-19 detection in chest X-raysRachael Harkness, Geoff Hall, Alejandro F Frangi et al.
Since the emergence of COVID-19, deep learning models have been developed to identify COVID-19 from chest X-rays. With little to no direct access to hospital data, the AI community relies heavily on public data comprising numerous data sources. Model performance results have been exceptional when training and testing on open-source data, surpassing the reported capabilities of AI in pneumonia-detection prior to the COVID-19 outbreak. In this study impactful models are trained on a widely used open-source data and tested on an external test set and a hospital dataset, for the task of classifying chest X-rays into one of three classes: COVID-19, non-COVID pneumonia and no-pneumonia. Classification performance of the models investigated is evaluated through ROC curves, confusion matrices and standard classification metrics. Explainability modules are implemented to explore the image features most important to classification. Data analysis and model evaluations show that the popular open-source dataset COVIDx is not representative of the real clinical problem and that results from testing on this are inflated. Dependence on open-source data can leave models vulnerable to bias and confounding variables, requiring careful analysis to develop clinically useful/viable AI tools for COVID-19 detection in chest X-rays.
IVMay 16, 2024
A Comprehensive Evaluation of Histopathology Foundation Models for Ovarian Cancer Subtype ClassificationJack Breen, Katie Allen, Kieran Zucker et al.
Large pretrained transformers are increasingly being developed as generalised foundation models which can underpin powerful task-specific artificial intelligence models. Histopathology foundation models show great promise across many tasks, but analyses have typically been limited by arbitrary hyperparameters that were not tuned to the specific task. We report the most rigorous single-task validation of histopathology foundation models to date, specifically in ovarian cancer morphological subtyping. Attention-based multiple instance learning classifiers were compared using three ImageNet-pretrained feature extractors and fourteen histopathology foundation models. The training set consisted of 1864 whole slide images from 434 ovarian carcinoma cases at Leeds Teaching Hospitals NHS Trust. Five-class classification performance was evaluated through five-fold cross-validation, and these cross-validation models were ensembled for hold-out testing and external validation on the Transcanadian Study and OCEAN Challenge datasets. The best-performing model used the H-optimus-0 foundation model, with five-class balanced accuracies of 89%, 97%, and 74% in the test sets. Normalisations and augmentations aided the performance of the ImageNet-pretrained ResNets, but these were still outperformed by 13 of the 14 foundation models. Hyperparameter tuning the downstream classifiers improved performance by a median 1.9% balanced accuracy, with many improvements being statistically significant. Histopathology foundation models offer a clear benefit to ovarian cancer subtyping, improving classification performance to a degree where clinical utility is tangible, albeit with an increased computational burden. Such models could provide a second opinion to histopathologists diagnosing challenging cases and may improve the accuracy, objectivity, and efficiency of pathological diagnoses overall.
IVOct 18, 2024
Integrating Deep Learning with Fundus and Optical Coherence Tomography for Cardiovascular Disease PredictionCynthia Maldonado-Garcia, Arezoo Zakeri, Alejandro F Frangi et al.
Early identification of patients at risk of cardiovascular diseases (CVD) is crucial for effective preventive care, reducing healthcare burden, and improving patients' quality of life. This study demonstrates the potential of retinal optical coherence tomography (OCT) imaging combined with fundus photographs for identifying future adverse cardiac events. We used data from 977 patients who experienced CVD within a 5-year interval post-image acquisition, alongside 1,877 control participants without CVD, totaling 2,854 subjects. We propose a novel binary classification network based on a Multi-channel Variational Autoencoder (MCVAE), which learns a latent embedding of patients' fundus and OCT images to classify individuals into two groups: those likely to develop CVD in the future and those who are not. Our model, trained on both imaging modalities, achieved promising results (AUROC 0.78 +/- 0.02, accuracy 0.68 +/- 0.002, precision 0.74 +/- 0.02, sensitivity 0.73 +/- 0.02, and specificity 0.68 +/- 0.01), demonstrating its efficacy in identifying patients at risk of future CVD events based on their retinal images. This study highlights the potential of retinal OCT imaging and fundus photographs as cost-effective, non-invasive alternatives for predicting cardiovascular disease risk. The widespread availability of these imaging techniques in optometry practices and hospitals further enhances their potential for large-scale CVD risk screening. Our findings contribute to the development of standardized, accessible methods for early CVD risk identification, potentially improving preventive care strategies and patient outcomes.
CVFeb 23, 2024
Unsupervised Domain Adaptation for Brain Vessel Segmentation through Transwarp Contrastive LearningFengming Lin, Yan Xia, Michael MacRaild et al.
Unsupervised domain adaptation (UDA) aims to align the labelled source distribution with the unlabelled target distribution to obtain domain-invariant predictive models. Since cross-modality medical data exhibit significant intra and inter-domain shifts and most are unlabelled, UDA is more important while challenging in medical image analysis. This paper proposes a simple yet potent contrastive learning framework for UDA to narrow the inter-domain gap between labelled source and unlabelled target distribution. Our method is validated on cerebral vessel datasets. Experimental results show that our approach can learn latent features from labelled 3DRA modality data and improve vessel segmentation performance in unlabelled MRA modality data.
CVAug 8, 2025
Can Diffusion Models Bridge the Domain Gap in Cardiac MR Imaging?Xin Ci Wong, Duygu Sarikaya, Kieran Zucker et al.
Magnetic resonance (MR) imaging, including cardiac MR, is prone to domain shift due to variations in imaging devices and acquisition protocols. This challenge limits the deployment of trained AI models in real-world scenarios, where performance degrades on unseen domains. Traditional solutions involve increasing the size of the dataset through ad-hoc image augmentation or additional online training/transfer learning, which have several limitations. Synthetic data offers a promising alternative, but anatomical/structural consistency constraints limit the effectiveness of generative models in creating image-label pairs. To address this, we propose a diffusion model (DM) trained on a source domain that generates synthetic cardiac MR images that resemble a given reference. The synthetic data maintains spatial and structural fidelity, ensuring similarity to the source domain and compatibility with the segmentation mask. We assess the utility of our generative approach in multi-centre cardiac MR segmentation, using the 2D nnU-Net, 3D nnU-Net and vanilla U-Net segmentation networks. We explore domain generalisation, where, domain-invariant segmentation models are trained on synthetic source domain data, and domain adaptation, where, we shift target domain data towards the source domain using the DM. Both strategies significantly improved segmentation performance on data from an unseen target domain, in terms of surface-based metrics (Welch's t-test, p < 0.01), compared to training segmentation models on real data alone. The proposed method ameliorates the need for transfer learning or online training to address domain shift challenges in cardiac MR image analysis, especially useful in data-scarce settings.
IVApr 17, 2025
Putting the Segment Anything Model to the Test with 3D Knee MRI - A Comparison with State-of-the-Art PerformanceOliver Mills, Philip Conaghan, Nishant Ravikumar et al.
Menisci are cartilaginous tissue found within the knee that contribute to joint lubrication and weight dispersal. Damage to menisci can lead to onset and progression of knee osteoarthritis (OA), a condition that is a leading cause of disability, and for which there are few effective therapies. Accurate automated segmentation of menisci would allow for earlier detection and treatment of meniscal abnormalities, as well as shedding more light on the role the menisci play in OA pathogenesis. Focus in this area has mainly used variants of convolutional networks, but there has been no attempt to utilise recent large vision transformer segmentation models. The Segment Anything Model (SAM) is a so-called foundation segmentation model, which has been found useful across a range of different tasks due to the large volume of data used for training the model. In this study, SAM was adapted to perform fully-automated segmentation of menisci from 3D knee magnetic resonance images. A 3D U-Net was also trained as a baseline. It was found that, when fine-tuning only the decoder, SAM was unable to compete with 3D U-Net, achieving a Dice score of $0.81\pm0.03$, compared to $0.87\pm0.03$, on a held-out test set. When fine-tuning SAM end-to-end, a Dice score of $0.87\pm0.03$ was achieved. The performance of both the end-to-end trained SAM configuration and the 3D U-Net were comparable to the winning Dice score ($0.88\pm0.03$) in the IWOAI Knee MRI Segmentation Challenge 2019. Performance in terms of the Hausdorff Distance showed that both configurations of SAM were inferior to 3D U-Net in matching the meniscus morphology. Results demonstrated that, despite its generalisability, SAM was unable to outperform a basic 3D U-Net in meniscus segmentation, and may not be suitable for similar 3D medical image segmentation tasks also involving fine anatomical structures with low contrast and poorly-defined boundaries.
CVFeb 23, 2024
GS-EMA: Integrating Gradient Surgery Exponential Moving Average with Boundary-Aware Contrastive Learning for Enhanced Domain Generalization in Aneurysm SegmentationFengming Lin, Yan Xia, Michael MacRaild et al.
The automated segmentation of cerebral aneurysms is pivotal for accurate diagnosis and treatment planning. Confronted with significant domain shifts and class imbalance in 3D Rotational Angiography (3DRA) data from various medical institutions, the task becomes challenging. These shifts include differences in image appearance, intensity distribution, resolution, and aneurysm size, all of which complicate the segmentation process. To tackle these issues, we propose a novel domain generalization strategy that employs gradient surgery exponential moving average (GS-EMA) optimization technique coupled with boundary-aware contrastive learning (BACL). Our approach is distinct in its ability to adapt to new, unseen domains by learning domain-invariant features, thereby improving the robustness and accuracy of aneurysm segmentation across diverse clinical datasets. The results demonstrate that our proposed approach can extract more domain-invariant features, minimizing over-segmentation and capturing more complete aneurysm structures.
IVJan 24, 2024
Segmenting Cardiac Muscle Z-disks with Deep Neural NetworksMihaela Croitor Ibrahim, Nishant Ravikumar, Alistair Curd et al.
Z-disks are complex structures that delineate repeating sarcomeres in striated muscle. They play significant roles in cardiomyocytes such as providing mechanical stability for the contracting sarcomere, cell signalling and autophagy. Changes in Z-disk architecture have been associated with impaired cardiac function. Hence, there is a strong need to create tools to segment Z-disks from microscopy images, that overcome traditional limitations such as variability in image brightness and staining technique. In this study, we apply deep learning based segmentation models to extract Z-disks in images of striated muscle tissue. We leverage a novel Airyscan confocal dataset, which comprises high resolution images of Z-disks of healthy heart tissue, stained with Affimers for specific Z-disk proteins. We employed an interactive labelling tool, Ilastik to obtain ground truth segmentation masks and use the resulting data set to train and evaluate the performance of several state-of-the-art segmentation networks. On the test set, UNet++ achieves best segmentation performance for Z-disks in cardiomyocytes, with an average Dice score of 0.91 and outperforms other established segmentation methods including UNet, FPN, DeepLabv3+ and pix2pix. However, pix2pix demonstrates improved generalisation, when tested on an additional dataset of cardiomyocytes with a titin mutation. This is the first study to demonstrate that automated machine learning-based segmentation approaches may be used effectively to segment Z-disks in confocal microscopy images. Automated segmentation approaches and predicted segmentation masks could be used to derive morphological features of Z-disks (e.g. width and orientation), and subsequently, to quantify disease-related changes to cardiac microstructure.
IVSep 1, 2021
Assessing domain adaptation techniques for mitosis detection in multi-scanner breast cancer histopathology imagesJack Breen, Kieran Zucker, Nicolas M. Orsi et al.
Breast cancer is the most commonly diagnosed cancer worldwide, with over two million new cases each year. During diagnostic tumour grading, pathologists manually count the number of dividing cells (mitotic figures) in biopsy or tumour resection specimens. Since the process is subjective and time-consuming, data-driven artificial intelligence (AI) methods have been developed to automatically detect mitotic figures. However, these methods often generalise poorly, with performance reduced by variations in tissue types, staining protocols, or the scanners used to digitise whole-slide images. Domain adaptation approaches have been adopted in various applications to mitigate this issue of domain shift. We evaluate two unsupervised domain adaptation methods, CycleGAN and Neural Style Transfer, using the MIDOG 2021 Challenge dataset. This challenge focuses on detecting mitotic figures in whole-slide images digitised using different scanners. Two baseline mitosis detection models based on U-Net and RetinaNet were investigated in combination with the aforementioned domain adaptation methods. Both baseline models achieved human expert level performance, but had reduced performance when evaluated on images which had been digitised using a different scanner. The domain adaptation techniques were each found to be beneficial for detection with data from some scanners but not for others, with the only average increase across all scanners being achieved by CycleGAN on the RetinaNet detector. These techniques require further refinement to ensure consistency in mitosis detection.
CVAug 11, 2021
Statistical Dependency Guided Contrastive Learning for Multiple Labeling in Prenatal UltrasoundShuangchi He, Zehui Lin, Xin Yang et al.
Standard plane recognition plays an important role in prenatal ultrasound (US) screening. Automatically recognizing the standard plane along with the corresponding anatomical structures in US image can not only facilitate US image interpretation but also improve diagnostic efficiency. In this study, we build a novel multi-label learning (MLL) scheme to identify multiple standard planes and corresponding anatomical structures of fetus simultaneously. Our contribution is three-fold. First, we represent the class correlation by word embeddings to capture the fine-grained semantic and latent statistical concurrency. Second, we equip the MLL with a graph convolutional network to explore the inner and outer relationship among categories. Third, we propose a novel cluster relabel-based contrastive learning algorithm to encourage the divergence among ambiguous classes. Extensive validation was performed on our large in-house dataset. Our approach reports the highest accuracy as 90.25% for standard planes labeling, 85.59% for planes and structures labeling and mAP as 94.63%. The proposed MLL scheme provides a novel perspective for standard plane recognition and can be easily extended to other medical image classification tasks.
CVAug 2, 2021
Flip Learning: Erase to SegmentYuhao Huang, Xin Yang, Yuxin Zou et al.
Nodule segmentation from breast ultrasound images is challenging yet essential for the diagnosis. Weakly-supervised segmentation (WSS) can help reduce time-consuming and cumbersome manual annotation. Unlike existing weakly-supervised approaches, in this study, we propose a novel and general WSS framework called Flip Learning, which only needs the box annotation. Specifically, the target in the label box will be erased gradually to flip the classification tag, and the erased region will be considered as the segmentation result finally. Our contribution is three-fold. First, our proposed approach erases on superpixel level using a Multi-agent Reinforcement Learning framework to exploit the prior boundary knowledge and accelerate the learning process. Second, we design two rewards: classification score and intensity distribution reward, to avoid under- and over-segmentation, respectively. Third, we adopt a coarse-to-fine learning strategy to reduce the residual errors and improve the segmentation performance. Extensively validated on a large dataset, our proposed approach achieves competitive performance and shows great potential to narrow the gap between fully-supervised and weakly-supervised learning.
IVAug 1, 2021
Style Curriculum Learning for Robust Medical Image SegmentationZhendong Liu, Van Manh, Xin Yang et al.
The performance of deep segmentation models often degrades due to distribution shifts in image intensities between the training and test data sets. This is particularly pronounced in multi-centre studies involving data acquired using multi-vendor scanners, with variations in acquisition protocols. It is challenging to address this degradation because the shift is often not known \textit{a priori} and hence difficult to model. We propose a novel framework to ensure robust segmentation in the presence of such distribution shifts. Our contribution is three-fold. First, inspired by the spirit of curriculum learning, we design a novel style curriculum to train the segmentation models using an easy-to-hard mode. A style transfer model with style fusion is employed to generate the curriculum samples. Gradually focusing on complex and adversarial style samples can significantly boost the robustness of the models. Second, instead of subjectively defining the curriculum complexity, we adopt an automated gradient manipulation method to control the hard and adversarial sample generation process. Third, we propose the Local Gradient Sign strategy to aggregate the gradient locally and stabilise training during gradient manipulation. The proposed framework can generalise to unknown distribution without using any target data. Extensive experiments on the public M\&Ms Challenge dataset demonstrate that our proposed framework can generalise deep models well to unknown distributions and achieve significant improvements in segmentation accuracy.
CVJul 31, 2021
Self Context and Shape Prior for Sensorless Freehand 3D Ultrasound ReconstructionMingyuan Luo, Xin Yang, Xiaoqiong Huang et al.
3D ultrasound (US) is widely used for its rich diagnostic information. However, it is criticized for its limited field of view. 3D freehand US reconstruction is promising in addressing the problem by providing broad range and freeform scan. The existing deep learning based methods only focus on the basic cases of skill sequences, and the model relies on the training data heavily. The sequences in real clinical practice are a mix of diverse skills and have complex scanning paths. Besides, deep models should adapt themselves to the testing cases with prior knowledge for better robustness, rather than only fit to the training cases. In this paper, we propose a novel approach to sensorless freehand 3D US reconstruction considering the complex skill sequences. Our contribution is three-fold. First, we advance a novel online learning framework by designing a differentiable reconstruction algorithm. It realizes an end-to-end optimization from section sequences to the reconstructed volume. Second, a self-supervised learning method is developed to explore the context information that reconstructed by the testing data itself, promoting the perception of the model. Third, inspired by the effectiveness of shape prior, we also introduce adversarial training to strengthen the learning of anatomical shape prior in the reconstructed volume. By mining the context and structural cues of the testing data, our online learning methods can drive the model to handle complex skill sequences. Experimental results on developmental dysplasia of the hip US and fetal US datasets show that, our proposed method can outperform the start-of-the-art methods regarding the shift errors and path similarities.
IVJul 9, 2021
A Deep Discontinuity-Preserving Image Registration NetworkXiang Chen, Nishant Ravikumar, Yan Xia et al.
Image registration aims to establish spatial correspondence across pairs, or groups of images, and is a cornerstone of medical image computing and computer-assisted-interventions. Currently, most deep learning-based registration methods assume that the desired deformation fields are globally smooth and continuous, which is not always valid for real-world scenarios, especially in medical image registration (e.g. cardiac imaging and abdominal imaging). Such a global constraint can lead to artefacts and increased errors at discontinuous tissue interfaces. To tackle this issue, we propose a weakly-supervised Deep Discontinuity-preserving Image Registration network (DDIR), to obtain better registration performance and realistic deformation fields. We demonstrate that our method achieves significant improvements in registration accuracy and predicts more realistic deformations, in registration experiments on cardiac magnetic resonance (MR) images from UK Biobank Imaging Study (UKBB), than state-of-the-art approaches.
IVJun 11, 2021
CAR-Net: Unsupervised Co-Attention Guided Registration Network for Joint Registration and Structure LearningXiang Chen, Yan Xia, Nishant Ravikumar et al.
Image registration is a fundamental building block for various applications in medical image analysis. To better explore the correlation between the fixed and moving images and improve registration performance, we propose a novel deep learning network, Co-Attention guided Registration Network (CAR-Net). CAR-Net employs a co-attention block to learn a new representation of the inputs, which drives the registration of the fixed and moving images. Experiments on UK Biobank cardiac cine-magnetic resonance image data demonstrate that CAR-Net obtains higher registration accuracy and smoother deformation fields than state-of-the-art unsupervised registration methods, while achieving comparable or better registration performance than corresponding weakly-supervised variants. In addition, our approach can provide critical structural information of the input fixed and moving images simultaneously in a completely unsupervised manner.
CVMar 15, 2021
Adapt Everywhere: Unsupervised Adaptation of Point-Clouds and Entropy Minimisation for Multi-modal Cardiac Image SegmentationSulaiman Vesal, Mingxuan Gu, Ronak Kosti et al.
Deep learning models are sensitive to domain shift phenomena. A model trained on images from one domain cannot generalise well when tested on images from a different domain, despite capturing similar anatomical structures. It is mainly because the data distribution between the two domains is different. Moreover, creating annotation for every new modality is a tedious and time-consuming task, which also suffers from high inter- and intra- observer variability. Unsupervised domain adaptation (UDA) methods intend to reduce the gap between source and target domains by leveraging source domain labelled data to generate labels for the target domain. However, current state-of-the-art (SOTA) UDA methods demonstrate degraded performance when there is insufficient data in source and target domains. In this paper, we present a novel UDA method for multi-modal cardiac image segmentation. The proposed method is based on adversarial learning and adapts network features between source and target domain in different spaces. The paper introduces an end-to-end framework that integrates: a) entropy minimisation, b) output feature space alignment and c) a novel point-cloud shape adaptation based on the latent features learned by the segmentation model. We validated our method on two cardiac datasets by adapting from the annotated source domain, bSSFP-MRI (balanced Steady-State Free Procession-MRI), to the unannotated target domain, LGE-MRI (Late-gadolinium enhance-MRI), for the multi-sequence dataset; and from MRI (source) to CT (target) for the cross-modality dataset. The results highlighted that by enforcing adversarial learning in different parts of the network, the proposed method delivered promising performance, compared to other SOTA methods.
IVDec 24, 2020
Spatio-temporal Multi-task Learning for Cardiac MRI Left Ventricle QuantificationSulaiman Vesal, Mingxuan Gu, Andreas Maier et al.
Quantitative assessment of cardiac left ventricle (LV) morphology is essential to assess cardiac function and improve the diagnosis of different cardiovascular diseases. In current clinical practice, LV quantification depends on the measurement of myocardial shape indices, which is usually achieved by manual contouring of the endo- and epicardial. However, this process subjected to inter and intra-observer variability, and it is a time-consuming and tedious task. In this paper, we propose a spatio-temporal multi-task learning approach to obtain a complete set of measurements quantifying cardiac LV morphology, regional-wall thickness (RWT), and additionally detecting the cardiac phase cycle (systole and diastole) for a given 3D Cine-magnetic resonance (MR) image sequence. We first segment cardiac LVs using an encoder-decoder network and then introduce a multitask framework to regress 11 LV indices and classify the cardiac phase, as parallel tasks during model optimization. The proposed deep learning model is based on the 3D spatio-temporal convolutions, which extract spatial and temporal features from MR images. We demonstrate the efficacy of the proposed method using cine-MR sequences of 145 subjects and comparing the performance with other state-of-the-art quantification methods. The proposed method obtained high prediction accuracy, with an average mean absolute error (MAE) of 129 $mm^2$, 1.23 $mm$, 1.76 $mm$, Pearson correlation coefficient (PCC) of 96.4%, 87.2%, and 97.5% for LV and myocardium (Myo) cavity regions, 6 RWTs, 3 LV dimensions, and an error rate of 9.0\% for phase classification. The experimental results highlight the robustness of the proposed method, despite varying degrees of cardiac morphology, image appearance, and low contrast in the cardiac MR sequences.
CVSep 1, 2020
Fed-Sim: Federated Simulation for Medical ImagingDaiqing Li, Amlan Kar, Nishant Ravikumar et al.
Labelling data is expensive and time consuming especially for domains such as medical imaging that contain volumetric imaging data and require expert knowledge. Exploiting a larger pool of labeled data available across multiple centers, such as in federated learning, has also seen limited success since current deep learning approaches do not generalize well to images acquired with scanners from different manufacturers. We aim to address these problems in a common, learning-based image simulation framework which we refer to as Federated Simulation. We introduce a physics-driven generative approach that consists of two learnable neural modules: 1) a module that synthesizes 3D cardiac shapes along with their materials, and 2) a CT simulator that renders these into realistic 3D CT Volumes, with annotations. Since the model of geometry and material is disentangled from the imaging sensor, it can effectively be trained across multiple medical centers. We show that our data synthesis framework improves the downstream segmentation performance on several datasets. Project Page: https://nv-tlabs.github.io/fed-sim/ .
LGJul 6, 2020
Partially Conditioned Generative Adversarial NetworksFrancisco J. Ibarrola, Nishant Ravikumar, Alejandro F. Frangi
Generative models are undoubtedly a hot topic in Artificial Intelligence, among which the most common type is Generative Adversarial Networks (GANs). These architectures let one synthesise artificial datasets by implicitly modelling the underlying probability distribution of a real-world training dataset. With the introduction of Conditional GANs and their variants, these methods were extended to generating samples conditioned on ancillary information available for each sample within the dataset. From a practical standpoint, however, one might desire to generate data conditioned on partial information. That is, only a subset of the ancillary conditioning variables might be of interest when synthesising data. In this work, we argue that standard Conditional GANs are not suitable for such a task and propose a new Adversarial Network architecture and training strategy to deal with the ensuing problems. Experiments illustrating the value of the proposed approach in digit and face image synthesis under partial conditioning information are presented, showing that the proposed method can effectively outperform the standard approach under these circumstances.
IVJun 22, 2020
Cardiac Segmentation on Late Gadolinium Enhancement MRI: A Benchmark Study from Multi-Sequence Cardiac MR Segmentation ChallengeXiahai Zhuang, Jiahang Xu, Xinzhe Luo et al.
Accurate computing, analysis and modeling of the ventricles and myocardium from medical images are important, especially in the diagnosis and treatment management for patients suffering from myocardial infarction (MI). Late gadolinium enhancement (LGE) cardiac magnetic resonance (CMR) provides an important protocol to visualize MI. However, automated segmentation of LGE CMR is still challenging, due to the indistinguishable boundaries, heterogeneous intensity distribution and complex enhancement patterns of pathological myocardium from LGE CMR. Furthermore, compared with the other sequences LGE CMR images with gold standard labels are particularly limited, which represents another obstacle for developing novel algorithms for automatic segmentation of LGE CMR. This paper presents the selective results from the Multi-Sequence Cardiac MR (MS-CMR) Segmentation challenge, in conjunction with MICCAI 2019. The challenge offered a data set of paired MS-CMR images, including auxiliary CMR sequences as well as LGE CMR, from 45 patients who underwent cardiomyopathy. It was aimed to develop new algorithms, as well as benchmark existing ones for LGE CMR segmentation and compare them objectively. In addition, the paired MS-CMR images could enable algorithms to combine the complementary information from the other sequences for the segmentation of LGE CMR. Nine representative works were selected for evaluation and comparisons, among which three methods are unsupervised methods and the other six are supervised. The results showed that the average performance of the nine methods was comparable to the inter-observer variations. The success of these methods was mainly attributed to the inclusion of the auxiliary sequences from the MS-CMR images, which provide important label information for the training of deep neural networks.
CVApr 26, 2020
A Global Benchmark of Algorithms for Segmenting Late Gadolinium-Enhanced Cardiac Magnetic Resonance ImagingZhaohan Xiong, Qing Xia, Zhiqiang Hu et al.
Segmentation of cardiac images, particularly late gadolinium-enhanced magnetic resonance imaging (LGE-MRI) widely used for visualizing diseased cardiac structures, is a crucial first step for clinical diagnosis and treatment. However, direct segmentation of LGE-MRIs is challenging due to its attenuated contrast. Since most clinical studies have relied on manual and labor-intensive approaches, automatic methods are of high interest, particularly optimized machine learning approaches. To address this, we organized the "2018 Left Atrium Segmentation Challenge" using 154 3D LGE-MRIs, currently the world's largest cardiac LGE-MRI dataset, and associated labels of the left atrium segmented by three medical experts, ultimately attracting the participation of 27 international teams. In this paper, extensive analysis of the submitted algorithms using technical and biological metrics was performed by undergoing subgroup analysis and conducting hyper-parameter analysis, offering an overall picture of the major design choices of convolutional neural networks (CNNs) and practical considerations for achieving state-of-the-art left atrium segmentation. Results show the top method achieved a dice score of 93.2% and a mean surface to a surface distance of 0.7 mm, significantly outperforming prior state-of-the-art. Particularly, our analysis demonstrated that double, sequentially used CNNs, in which a first CNN is used for automatic region-of-interest localization and a subsequent CNN is used for refined regional segmentation, achieved far superior results than traditional methods and pipelines containing single CNNs. This large-scale benchmarking study makes a significant step towards much-improved segmentation methods for cardiac LGE-MRIs, and will serve as an important benchmark for evaluating and comparing the future works in the field.
SPFeb 7, 2020
The Effect of Data Augmentation on Classification of Atrial Fibrillation in Short Single-Lead ECG Signals Using Deep Neural NetworksFaezeh Nejati Hatamian, Nishant Ravikumar, Sulaiman Vesal et al.
Cardiovascular diseases are the most common cause of mortality worldwide. Detection of atrial fibrillation (AF) in the asymptomatic stage can help prevent strokes. It also improves clinical decision making through the delivery of suitable treatment such as, anticoagulant therapy, in a timely manner. The clinical significance of such early detection of AF in electrocardiogram (ECG) signals has inspired numerous studies in recent years, of which many aim to solve this task by leveraging machine learning algorithms. ECG datasets containing AF samples, however, usually suffer from severe class imbalance, which if unaccounted for, affects the performance of classification algorithms. Data augmentation is a popular solution to tackle this problem. In this study, we investigate the impact of various data augmentation algorithms, e.g., oversampling, Gaussian Mixture Models (GMMs) and Generative Adversarial Networks (GANs), on solving the class imbalance problem. These algorithms are quantitatively and qualitatively evaluated, compared and discussed in detail. The results show that deep learning-based AF signal classification methods benefit more from data augmentation using GANs and GMMs, than oversampling. Furthermore, the GAN results in circa $3\%$ better AF classification accuracy in average while performing comparably to the GMM in terms of f1-score.
IVJan 4, 2020
COPD Classification in CT Images Using a 3D Convolutional Neural NetworkJalil Ahmed, Sulaiman Vesal, Felix Durlak et al.
Chronic obstructive pulmonary disease (COPD) is a lung disease that is not fully reversible and one of the leading causes of morbidity and mortality in the world. Early detection and diagnosis of COPD can increase the survival rate and reduce the risk of COPD progression in patients. Currently, the primary examination tool to diagnose COPD is spirometry. However, computed tomography (CT) is used for detecting symptoms and sub-type classification of COPD. Using different imaging modalities is a difficult and tedious task even for physicians and is subjective to inter-and intra-observer variations. Hence, developing meth-ods that can automatically classify COPD versus healthy patients is of great interest. In this paper, we propose a 3D deep learning approach to classify COPD and emphysema using volume-wise annotations only. We also demonstrate the impact of transfer learning on the classification of emphysema using knowledge transfer from a pre-trained COPD classification model.
IVDec 19, 2019
Analyzing an Imitation Learning Network for Fundus Image Registration Using a Divide-and-Conquer ApproachSiming Bayer, Xia Zhong, Weilin Fu et al.
Comparison of microvascular circulation on fundoscopic images is a non-invasive clinical indication for the diagnosis and monitoring of diseases, such as diabetes and hypertensions. The differences between intra-patient images can be assessed quantitatively by registering serial acquisitions. Due to the variability of the images (i.e. contrast, luminosity) and the anatomical changes of the retina, the registration of fundus images remains a challenging task. Recently, several deep learning approaches have been proposed to register fundus images in an end-to-end fashion, achieving remarkable results. However, the results are difficult to interpret and analyze. In this work, we propose an imitation learning framework for the registration of 2D color funduscopic images for a wide range of applications such as disease monitoring, image stitching and super-resolution. We follow a divide-and-conquer approach to improve the interpretability of the proposed network, and analyze both the influence of the input image and the hyperparameters on the registration result. The results show that the proposed registration network reduces the initial target registration error up to 95\%.
IVDec 12, 2019
Coronary Artery Plaque Characterization from CCTA Scans using Deep Learning and RadiomicsFelix Denzinger, Michael Wels, Nishant Ravikumar et al.
Assessing coronary artery plaque segments in coronary CT angiography scans is an important task to improve patient management and clinical outcomes, as it can help to decide whether invasive investigation and treatment are necessary. In this work, we present three machine learning approaches capable of performing this task. The first approach is based on radiomics, where a plaque segmentation is used to calculate various shape-, intensity- and texture-based features under different image transformations. A second approach is based on deep learning and relies on centerline extraction as sole prerequisite. In the third approach, we fuse the deep learning approach with radiomic features. On our data the methods reached similar scores as simulated fractional flow reserve (FFR) measurements, which - in contrast to our methods - requires an exact segmentation of the whole coronary tree and often time-consuming manual interaction. In literature, the performance of simulated FFR reaches an AUC between 0.79-0.93 predicting an abnormal invasive FFR that demands revascularization. The radiomics approach achieves an AUC of 0.86, the deep learning approach 0.84 and the combined method 0.88 for predicting the revascularization decision directly. While all three proposed methods can be determined within seconds, the FFR simulation typically takes several minutes. Provided representative training data in sufficient quantities, we believe that the presented methods can be used to create systems for fully automatic non-invasive risk assessment for a variety of adverse cardiac events.
IVAug 21, 2019
Automated Multi-sequence Cardiac MRI Segmentation Using Supervised Domain AdaptationSulaiman Vesal, Nishant Ravikumar, Andreas Maier
Left ventricle segmentation and morphological assessment are essential for improving diagnosis and our understanding of cardiomyopathy, which in turn is imperative for reducing risk of myocardial infarctions in patients. Convolutional neural network (CNN) based methods for cardiac magnetic resonance (CMR) image segmentation rely on supervision with pixel-level annotations, and may not generalize well to images from a different domain. These methods are typically sensitive to variations in imaging protocols and data acquisition. Since annotating multi-sequence CMR images is tedious and subject to inter- and intra-observer variations, developing methods that can automatically adapt from one domain to the target domain is of great interest. In this paper, we propose an approach for domain adaptation in multi-sequence CMR segmentation task using transfer learning that combines multi-source image information. We first train an encoder-decoder CNN on T2-weighted and balanced-Steady State Free Precession (bSSFP) MR images with pixel-level annotation and fine-tune the same network with a limited number of Late Gadolinium Enhanced-MR (LGE-MR) subjects, to adapt the domain features. The domain-adapted network was trained with just four LGE-MR training samples and obtained an average Dice score of $\sim$85.0\% on the test set comprises of 40 LGE-MR subjects. The proposed method significantly outperformed a network without adaptation trained from scratch on the same set of LGE-MR training data.
LGJul 14, 2019
A Divide-and-Conquer Approach towards Understanding Deep NetworksWeilin Fu, Katharina Breininger, Roman Schaffert et al.
Deep neural networks have achieved tremendous success in various fields including medical image segmentation. However, they have long been criticized for being a black-box, in that interpretation, understanding and correcting architectures is difficult as there is no general theory for deep neural network design. Previously, precision learning was proposed to fuse deep architectures and traditional approaches. Deep networks constructed in this way benefit from the original known operator, have fewer parameters, and improved interpretability. However, they do not yield state-of-the-art performance in all applications. In this paper, we propose to analyze deep networks using known operators, by adopting a divide-and-conquer strategy to replace network components, whilst retaining its performance. The task of retinal vessel segmentation is investigated for this purpose. We start with a high-performance U-Net and show by step-by-step conversion that we are able to divide the network into modules of known operators. The results indicate that a combination of a trainable guided filter and a trainable version of the Frangi filter yields a performance at the level of U-Net (AUC 0.974 vs. 0.972) with a tremendous reduction in parameters (111,536 vs. 9,575). In addition, the trained layers can be mapped back into their original algorithmic interpretation and analyzed using standard tools of signal processing.