CV | May 28
Controllable Lung Nodule Synthesis via Histogram-Regularized Latent Diffusion Models
Arunkumar Kannan, Yanbo Zhang, Han Liu et al.
While automated diagnosis systems have achieved remarkable success in computed tomography (CT)-based lung cancer screening, their development remains limited by the scarcity of diverse, annotated pulmonary nodule datasets. Diffusion-based generative models offer a promising strategy for data synthesis; however, many existing conditional approaches primarily optimize spatial reconstruction losses, which encourage voxel-wise similarity but may inadequately constrain lesion-level intensity distributions. As a result, these methods may produce over-smoothed texture profiles and underrepresent the distinct attenuation characteristics of different nodule subtypes, including solid, part-solid, and ground-glass nodules. To address this challenge, we propose a controllable latent diffusion model that synthesizes pulmonary nodules within full 3D CT volumes while accurately modeling nodule-specific intensity distributions. Specifically, rather than relying solely on spatial losses, we introduce a histogram-based regularization term that constrains voxel intensity distributions during the generative process. The model combines subtype, spatial mask, and Hounsfield unit (HU) histogram conditioning with the differentiable feature-space histogram regularization term to better align lesion-level intensity distributions, improving the visual plausibility and subtype consistency of synthesized nodules. Extensive experiments on lung CT data demonstrate that our framework achieves strong visual realism, validated through both quantitative metrics and a visual Turing test. Furthermore, when used for data augmentation, the generated nodules improve performance in downstream clinical tasks, particularly for underrepresented nodule subtypes, and show a potential benefit for subtype-informed malignancy classification.
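The histogram regularization idea above can be illustrated with a soft histogram. This is a toy sketch under stated assumptions, not the paper's implementation: the Gaussian-kernel binning, the L1 distance, the HU range, and the names `soft_histogram` and `histogram_l1` are all hypothetical, and the sketch works on raw HU values rather than on feature-space activations.

```python
import math

def soft_histogram(values, lo, hi, bins, bandwidth):
    """Gaussian-kernel soft histogram (assumed form, not taken from the paper).

    Each value spreads its unit mass smoothly over the bin centers, so the
    result varies smoothly with the inputs, unlike hard binning.
    """
    width = (hi - lo) / bins
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    hist = [0.0] * bins
    for v in values:
        weights = [math.exp(-0.5 * ((v - c) / bandwidth) ** 2) for c in centers]
        total = sum(weights)
        if total == 0.0:  # value too far from every bin center; skip it
            continue
        for i, w in enumerate(weights):
            hist[i] += w / total
    return [h / len(values) for h in hist]

def histogram_l1(p, q):
    """L1 distance between two histograms with the same binning."""
    return sum(abs(a - b) for a, b in zip(p, q))

# Hypothetical HU samples: a ground-glass-like and a solid-like nodule.
ground_glass = [-650.0, -600.0, -620.0, -580.0, -700.0]
ground_glass_shifted = [v + 10.0 for v in ground_glass]
solid = [-20.0, 10.0, 30.0, 0.0, -40.0]

h_gg = soft_histogram(ground_glass, -1000.0, 400.0, 14, 50.0)
h_gg_shifted = soft_histogram(ground_glass_shifted, -1000.0, 400.0, 14, 50.0)
h_solid = soft_histogram(solid, -1000.0, 400.0, 14, 50.0)

penalty_same_subtype = histogram_l1(h_gg, h_gg_shifted)
penalty_wrong_subtype = histogram_l1(h_gg, h_solid)
```

In this toy setting the penalty is small when the synthesized intensities stay close to the target subtype and large when they resemble a different subtype, which is the behaviour a histogram term is meant to encourage.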
CV | Jun 18, 2023
Generation of Radiology Findings in Chest X-Ray by Leveraging Collaborative Knowledge
Manuela Daniela Danu, George Marica, Sanjeev Kumar Karn et al.
Among all the sub-sections in a typical radiology report, the Clinical Indications, Findings, and Impression often reflect important details about the health status of a patient. The information included in Impression is also often covered in Findings. While Findings and Impression can be deduced by inspecting the image, Clinical Indications often require additional context. The cognitive task of interpreting medical images remains the most critical and often time-consuming step in the radiology workflow. Instead of generating an end-to-end radiology report, in this paper, we focus on generating the Findings from automated interpretation of medical images, specifically chest X-rays (CXRs). Thus, this work focuses on reducing the workload of radiologists who spend most of their time either writing or narrating the Findings. Unlike past research, which addresses radiology report generation as a single-step image captioning task, we have further taken into consideration the complexity of interpreting CXR images and propose a two-step approach: (a) detecting the regions with abnormalities in the image, and (b) generating relevant text for regions with abnormalities by employing a generative large language model (LLM). This two-step approach introduces a layer of interpretability and aligns the framework with the systematic reasoning that radiologists use when reviewing a CXR.
CV | Apr 27, 2023
COSST: Multi-organ Segmentation with Partially Labeled Datasets Using Comprehensive Supervisions and Self-training
Han Liu, Zhoubing Xu, Riqiang Gao et al.
Deep learning models have demonstrated remarkable success in multi-organ segmentation but typically require large-scale datasets with all organs of interest annotated. However, medical image datasets are often low in sample size and only partially labeled, i.e., only a subset of organs are annotated. Therefore, it is crucial to investigate how to learn a unified model on the available partially labeled datasets to leverage their synergistic potential. In this paper, we systematically investigate the partial-label segmentation problem with theoretical and empirical analyses on the prior techniques. We revisit the problem from a perspective of partial label supervision signals and identify two signals derived from ground truth and one from pseudo labels. We propose a novel two-stage framework termed COSST, which effectively and efficiently integrates comprehensive supervision signals with self-training. Concretely, we first train an initial unified model using two ground truth-based signals and then iteratively incorporate the pseudo label signal into the initial model using self-training. To mitigate performance degradation caused by unreliable pseudo labels, we assess the reliability of pseudo labels via outlier detection in latent space and exclude the most unreliable pseudo labels from each self-training iteration. Extensive experiments are conducted on one public and three private partial-label segmentation tasks over 12 CT datasets. Experimental results show that our proposed COSST achieves significant improvement over the baseline method, i.e., individual networks trained on each partially labeled dataset. Compared to the state-of-the-art partial-label segmentation methods, COSST demonstrates consistently superior performance on various segmentation tasks and with different training data sizes.
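The pseudo-label filtering step described above can be sketched as follows. The abstract does not say which outlier detector is used, so this example assumes a very simple one (Euclidean distance to the latent centroid); the function name `filter_pseudo_labels`, the `drop_fraction` parameter, and the toy latent vectors are all hypothetical.

```python
import math

def filter_pseudo_labels(latents, drop_fraction):
    """Return sorted indices of pseudo labels kept for the next self-training round.

    Assumed reliability score: distance of each latent vector to the centroid
    of all latent vectors. The farthest `drop_fraction` of samples is excluded.
    """
    n = len(latents)
    dim = len(latents[0])
    centroid = [sum(v[d] for v in latents) / n for d in range(dim)]
    dists = [math.dist(v, centroid) for v in latents]
    n_drop = int(n * drop_fraction)
    order = sorted(range(n), key=lambda i: dists[i])  # most typical first
    return sorted(order[:n - n_drop])

# Toy latent codes: four tightly clustered pseudo labels and one outlier.
latents = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [-0.1, 0.0], [5.0, 5.0]]
kept = filter_pseudo_labels(latents, drop_fraction=0.2)
```

With these toy values the outlier at index 4 is excluded and the four clustered samples are kept. A real pipeline would recompute the latent codes and repeat the filtering in each self-training iteration.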
CV | Dec 15, 2025
Revisiting 2D Foundation Models for Scalable 3D Medical Image Classification
Han Liu, Bogdan Georgescu, Yanbo Zhang et al.
3D medical image classification is essential for modern clinical workflows. Medical foundation models (FMs) have emerged as a promising approach for scaling to new tasks, yet current research suffers from three critical pitfalls: data-regime bias, suboptimal adaptation, and insufficient task coverage. In this paper, we address these pitfalls and introduce AnyMC3D, a scalable 3D classifier adapted from 2D FMs. Our method scales efficiently to new tasks by adding only lightweight plugins (about 1M parameters per task) on top of a single frozen backbone. This versatile framework also supports multi-view inputs, auxiliary pixel-level supervision, and interpretable heatmap generation. We establish a comprehensive benchmark of 12 tasks covering diverse pathologies, anatomies, and modalities, and systematically analyze state-of-the-art 3D classification techniques. Our analysis reveals key insights: (1) effective adaptation is essential to unlock FM potential, (2) general-purpose FMs can match medical-specific FMs if properly adapted, and (3) 2D-based methods surpass 3D architectures for 3D classification. For the first time, we demonstrate the feasibility of achieving state-of-the-art performance across diverse applications using a single scalable framework (including 1st place in the VLM3D challenge), eliminating the need for separate task-specific models.
CV | Mar 13, 2025 | Code
AI-assisted Early Detection of Pancreatic Ductal Adenocarcinoma on Contrast-enhanced CT
Han Liu, Riqiang Gao, Sasa Grbic
Pancreatic ductal adenocarcinoma (PDAC) is one of the most common and aggressive types of pancreatic cancer. However, due to the lack of early and disease-specific symptoms, most patients with PDAC are diagnosed at an advanced disease stage. Consequently, early PDAC detection is crucial for improving patients' quality of life and expanding treatment options. In this work, we develop a coarse-to-fine approach to detect PDAC on contrast-enhanced CT scans. First, we localize and crop the region of interest from the low-resolution images, and then segment the PDAC-related structures at a finer scale. Additionally, we introduce two strategies to further boost detection performance: (1) a data-splitting strategy for model ensembling, and (2) a customized post-processing function. We participated in the PANORAMA challenge and ranked 1st place for PDAC detection with an AUROC of 0.9263 and an AP of 0.7243. Our code and models are publicly available at https://github.com/han-liu/PDAC_detection.
IV | Feb 28, 2025
A Non-contrast Head CT Foundation Model for Comprehensive Neuro-Trauma Triage
Youngjin Yoo, Bogdan Georgescu, Yanbo Zhang et al.
Recent advancements in AI and medical imaging offer transformative potential in emergency head CT interpretation for reducing assessment times and improving accuracy in the face of increasing demand for such scans and a global shortage of radiologists. This study introduces a 3D foundation model for detecting diverse neuro-trauma findings with high accuracy and efficiency. Using large language models (LLMs) for automatic labeling, we generated comprehensive multi-label annotations for critical conditions. Our approach involved pretraining neural networks for hemorrhage subtype segmentation and brain anatomy parcellation, which were integrated into a pretrained comprehensive neuro-trauma detection network through multimodal fine-tuning. Performance evaluation against expert annotations and comparison with CT-CLIP demonstrated strong triage accuracy across major neuro-trauma findings, such as hemorrhage and midline shift, as well as less frequent critical conditions such as cerebral edema and arterial hyperdensity. The integration of neuro-specific features significantly enhanced diagnostic capabilities, achieving an average AUC of 0.861 for 16 neuro-trauma conditions. This work advances foundation models in medical imaging, serving as a benchmark for future AI-assisted neuro-trauma diagnostics in emergency radiology.
CV | Jul 25, 2025
Exemplar Med-DETR: Toward Generalized and Robust Lesion Detection in Mammogram Images and beyond
Sheethal Bhat, Bogdan Georgescu, Adarsh Bhandary Panambur et al.
Detecting abnormalities in medical images poses unique challenges due to differences in feature representations and the intricate relationship between anatomical structures and abnormalities. This is especially evident in mammography, where dense breast tissue can obscure lesions, complicating radiological interpretation. Despite leveraging anatomical and semantic context, existing detection methods struggle to learn effective class-specific features, limiting their applicability across different tasks and imaging modalities. In this work, we introduce Exemplar Med-DETR, a novel multi-modal contrastive detector that enables feature-based detection. It employs cross-attention with inherently derived, intuitive class-specific exemplar features and is trained with an iterative strategy. We achieve state-of-the-art performance across three distinct imaging modalities from four public datasets. On Vietnamese dense breast mammograms, we attain an mAP of 0.7 for mass detection and 0.55 for calcifications, yielding an absolute improvement of 16 percentage points. Additionally, a radiologist-supported evaluation of 100 mammograms from an out-of-distribution Chinese cohort demonstrates a twofold gain in lesion detection performance. For chest X-rays and angiography, we achieve an mAP of 0.25 for mass and 0.37 for stenosis detection, improving results by 4 and 7 percentage points, respectively. These results highlight the potential of our approach to advance robust and generalizable detection systems for medical imaging.
CV | Jan 4, 2022
Self-supervised Learning from 100 Million Medical Images
Florin C. Ghesu, Bogdan Georgescu, Awais Mansoor et al.
Building accurate and robust artificial intelligence systems for medical image assessment requires not only the research and design of advanced deep learning models but also the creation of large and curated sets of annotated training examples. Constructing such datasets, however, is often very costly, due to the complex nature of annotation tasks and the high level of expertise required for the interpretation of medical images (e.g., expert radiologists). To counter this limitation, we propose a method for self-supervised learning of rich image features based on contrastive learning and online feature clustering. For this purpose, we leverage large training datasets of over 100,000,000 medical images of various modalities, including radiography, computed tomography (CT), magnetic resonance (MR) imaging and ultrasonography. We propose to use these features to guide model training in supervised and hybrid self-supervised/supervised regimes on various downstream tasks. We highlight a number of advantages of this strategy on challenging image assessment problems in radiography, CT and MR: 1) Significant increase in accuracy compared to the state-of-the-art (e.g., AUC boost of 3-7% for detection of abnormalities from chest radiography scans and hemorrhage detection on brain CT); 2) Acceleration of model convergence during training by up to 85% compared to using no pretraining (e.g., 83% when training a model for detection of brain metastases in MR scans); 3) Increase in robustness to various image augmentations, such as intensity variations, rotations or scaling reflective of data variation seen in the field.
CV | Apr 12, 2021
Robust Classification from Noisy Labels: Integrating Additional Knowledge for Chest Radiography Abnormality Assessment
Sebastian Gündel, Arnaud A. A. Setio, Florin C. Ghesu et al.
Chest radiography is the most common radiographic examination performed in daily clinical practice for the detection of various heart and lung abnormalities. The large amount of data to be read and reported, with more than 100 studies per day for a single radiologist, poses a challenge in consistently maintaining high interpretation accuracy. The introduction of large-scale public datasets has led to a series of novel systems for automated abnormality classification. However, the labels of these datasets were obtained using natural language processed medical reports, yielding a large degree of label noise that can impact the performance. In this study, we propose novel training strategies that handle label noise from such suboptimal data. Prior label probabilities were measured on a subset of training data re-read by 4 board-certified radiologists and were used during training to increase the robustness of the training model to the label noise. Furthermore, we exploit the high comorbidity of abnormalities observed in chest radiography and incorporate this information to further reduce the impact of label noise. Additionally, anatomical knowledge is incorporated by training the system to predict lung and heart segmentation, as well as spatial knowledge labels. To deal with multiple datasets and images derived from various scanners that apply different post-processing techniques, we introduce a novel image normalization strategy. Experiments were performed on an extensive collection of 297,541 chest radiographs from 86,876 patients, leading to a state-of-the-art performance level for 17 abnormalities from 2 datasets. With an average AUC score of 0.880 across all abnormalities, our proposed training strategies can be used to significantly improve performance scores.
IV | Aug 13, 2020
Automated detection and quantification of COVID-19 airspace disease on chest radiographs: A novel approach achieving radiologist-level performance using a CNN trained on digital reconstructed radiographs (DRRs) from CT-based ground-truth
Eduardo Mortani Barbosa, Warren B. Gefter, Rochelle Yang et al.
Purpose: To leverage volumetric quantification of airspace disease (AD) derived from a superior modality (CT) serving as ground truth, projected onto digitally reconstructed radiographs (DRRs) to: 1) train a convolutional neural network to quantify airspace disease on paired CXRs; and 2) compare the DRR-trained CNN to expert human readers in the CXR evaluation of patients with confirmed COVID-19. Materials and Methods: We retrospectively selected a cohort of 86 COVID-19 patients (with positive RT-PCR), from March to May 2020 at a tertiary hospital in the northeastern USA, who underwent chest CT and CXR within 48 hours. The ground truth volumetric percentage of COVID-19 related AD (POv) was established by manual AD segmentation on CT. The resulting 3D masks were projected into 2D anterior-posterior digitally reconstructed radiographs (DRRs) to compute area-based AD percentage (POa). A convolutional neural network (CNN) was trained with DRR images generated from a larger-scale CT dataset of COVID-19 and non-COVID-19 patients, automatically segmenting lungs and AD and quantifying POa on CXR. CNN POa results were compared to POa quantified on CXR by two expert readers and to the POv ground truth, by computing correlations and mean absolute errors. Results: Bootstrap mean absolute error (MAE) and correlations between POa and POv were 11.98% [11.05%-12.47%] and 0.77 [0.70-0.82] for the average of expert readers, and 9.56%-9.78% [8.83%-10.22%] and 0.78-0.81 [0.73-0.85] for the CNN, respectively. Conclusion: Our CNN trained with DRR using CT-derived airspace quantification achieved expert radiologist level of accuracy in the quantification of airspace disease on CXR, in patients with positive RT-PCR for COVID-19.
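The agreement statistics reported above (mean absolute error, correlation, and bootstrap intervals) can be computed along the following lines. This is a minimal sketch on made-up numbers: the `poa` and `pov` values are hypothetical, the correlation is assumed to be Pearson's, and the percentile bootstrap shown here is an assumption because the abstract does not specify the exact procedure.

```python
import math
import random

def mean_absolute_error(pred, truth):
    """Mean absolute difference between paired measurements."""
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(pred)

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def bootstrap_ci(pred, truth, stat, n_boot=1000, seed=0):
    """95% percentile bootstrap interval for `stat`, resampling cases with replacement."""
    rng = random.Random(seed)
    n = len(pred)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(stat([pred[i] for i in idx], [truth[i] for i in idx]))
    stats.sort()
    return stats[int(0.025 * n_boot)], stats[int(0.975 * n_boot) - 1]

# Hypothetical percentages: area-based POa (from CXR) vs. volumetric POv (from CT).
poa = [10.0, 20.0, 30.0, 40.0]
pov = [12.0, 18.0, 33.0, 39.0]

mae = mean_absolute_error(poa, pov)
r = pearson_r(poa, pov)
ci_low, ci_high = bootstrap_ci(poa, pov, mean_absolute_error)
```

Resampling whole cases keeps each POa paired with its POv, which is what makes the interval reflect patient-level variability.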
IV | Aug 5, 2020
Extracting and Leveraging Nodule Features with Lung Inpainting for Local Feature Augmentation
Sebastian Guendel, Arnaud Arindra Adiyoso Setio, Sasa Grbic et al.
Chest X-ray (CXR) is the most common examination for fast detection of pulmonary abnormalities. Recently, automated algorithms have been developed to classify multiple diseases and abnormalities in CXR scans. However, because of the limited availability of scans containing nodules and the subtle properties of nodules in CXRs, state-of-the-art methods do not perform well on nodule classification. To create additional data for the training process, standard augmentation techniques are applied. However, the variance introduced by these methods is limited as the images are typically modified globally. In this paper, we propose a method for local feature augmentation by extracting local nodule features using a generative inpainting network. The network is applied to generate realistic, healthy tissue and structures in patches containing nodules. The nodules are entirely removed in the inpainted representation. The nodule features are extracted by subtracting the inpainted patch from the nodule patch. With arbitrary displacement of the extracted nodules in the lung area across different CXR scans and further local modifications during training, we significantly increase the nodule classification performance and outperform state-of-the-art augmentation methods.
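The subtraction-and-displacement step described above can be sketched in a few lines. The inpainting network itself is not modelled: `inpainted_patch` simply stands in for its output, and the function names, patch sizes, and intensity values are hypothetical.

```python
def extract_nodule_feature(nodule_patch, inpainted_patch):
    """Nodule-only residual: original patch minus its inpainted (healthy) version."""
    return [[n - h for n, h in zip(row_n, row_h)]
            for row_n, row_h in zip(nodule_patch, inpainted_patch)]

def paste_feature(image, feature, top, left):
    """Add the extracted residual onto a copy of `image` at (top, left)."""
    out = [row[:] for row in image]
    for i, feature_row in enumerate(feature):
        for j, value in enumerate(feature_row):
            out[top + i][left + j] += value
    return out

# Toy 2x2 patch with a nodule and the corresponding inpainted healthy tissue.
nodule_patch = [[120, 180], [170, 130]]
inpainted_patch = [[100, 100], [100, 100]]
feature = extract_nodule_feature(nodule_patch, inpainted_patch)

# Paste the nodule residual into a different 4x4 lung region.
target = [[90] * 4 for _ in range(4)]
augmented = paste_feature(target, feature, top=1, left=2)
```

Because only the residual is moved, the surrounding anatomy of the target image is preserved, which is what distinguishes this local augmentation from global transformations.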
IV | Jul 8, 2020
Quantifying and Leveraging Predictive Uncertainty for Medical Image Assessment
Florin C. Ghesu, Bogdan Georgescu, Awais Mansoor et al.
The interpretation of medical images is a challenging task, often complicated by the presence of artifacts, occlusions, limited contrast and more. Most notable is the case of chest radiography, where there is a high inter-rater variability in the detection and classification of abnormalities. This is largely due to inconclusive evidence in the data or subjective definitions of disease appearance. An additional example is the classification of anatomical views based on 2D Ultrasound images. Often, the anatomical context captured in a frame is not sufficient to recognize the underlying anatomy. Current machine learning solutions for these problems are typically limited to providing probabilistic predictions, relying on the capacity of underlying models to adapt to limited information and the high degree of label noise. In practice, however, this leads to overconfident systems with poor generalization on unseen data. To account for this, we propose a system that learns not only the probabilistic estimate for classification, but also an explicit uncertainty measure which captures the confidence of the system in the predicted output. We argue that this approach is essential to account for the inherent ambiguity characteristic of medical images from different radiologic exams including computed radiography, ultrasonography and magnetic resonance imaging. In our experiments we demonstrate that sample rejection based on the predicted uncertainty can significantly improve the ROC-AUC for various tasks, e.g., by 8% to 0.91 with an expected rejection rate of under 25% for the classification of different abnormalities in chest radiographs. In addition, we show that using uncertainty-driven bootstrapping to filter the training data, one can achieve a significant increase in robustness and accuracy.
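Sample rejection based on predicted uncertainty, as described above, can be illustrated with a small example. All labels, scores, and uncertainty values below are made up, and the helper names are hypothetical; the sketch only shows the mechanism of discarding the most uncertain cases before computing ROC-AUC, not how the uncertainty itself is learned.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def reject_uncertain(labels, scores, uncertainties, reject_rate):
    """Keep the least uncertain cases, dropping a `reject_rate` fraction."""
    n = len(labels)
    n_keep = n - int(n * reject_rate)
    order = sorted(range(n), key=lambda i: uncertainties[i])[:n_keep]
    return [labels[i] for i in order], [scores[i] for i in order]

# Hypothetical predictions; the two misranked cases also have high uncertainty.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.1, 0.2, 0.6, 0.3]
uncertainties = [0.1, 0.2, 0.9, 0.2, 0.1, 0.1, 0.8, 0.3]

auc_all = roc_auc(labels, scores)
kept_labels, kept_scores = reject_uncertain(labels, scores, uncertainties, 0.25)
auc_kept = roc_auc(kept_labels, kept_scores)
```

The gain depends entirely on how well the uncertainty correlates with errors; in this toy data the correlation is perfect by construction, so the AUC on the retained cases rises from 0.9375 to 1.0.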
IV | Jun 9, 2020
Machine Learning Automatically Detects COVID-19 using Chest CTs in a Large Multicenter Cohort
Eduardo Jose Mortani Barbosa, Bogdan Georgescu, Shikha Chaganti et al.
Objectives: To investigate machine-learning classifiers and interpretable models using chest CT for detection of COVID-19 and differentiation from other pneumonias, ILD and normal CTs. Methods: Our retrospective multi-institutional study obtained 2096 chest CTs from 16 institutions (including 1077 COVID-19 patients). Training/testing cohorts included 927/100 COVID-19, 388/33 ILD, 189/33 other pneumonias, and 559/34 normal (no pathologies) CTs. A metric-based approach for classification of COVID-19 used interpretable features, relying on logistic regression and random forests. A deep learning-based classifier differentiated COVID-19 via 3D features extracted directly from CT attenuation and probability distribution of airspace opacities. Results: The most discriminative features of COVID-19 are percentage of airspace opacity and peripheral and basal predominant opacities, concordant with the typical characterization of COVID-19 in the literature. Unsupervised hierarchical clustering compares feature distribution across COVID-19 and control cohorts. The metric-based classifier achieved AUC=0.83, sensitivity=0.74, and specificity=0.79, versus 0.93, 0.90, and 0.83, respectively, for the DL-based classifier. Most of the ambiguity comes from non-COVID-19 pneumonia with manifestations that overlap with COVID-19, as well as mild COVID-19 cases. Non-COVID-19 classification performance is 91% for ILD, 64% for other pneumonias and 94% for no pathologies, which demonstrates the robustness of our method against different compositions of control groups. Conclusions: Our new method accurately discriminates COVID-19 from other types of pneumonia, ILD, and CTs with no pathologies, using quantitative imaging features derived from chest CT, while balancing interpretability of results and classification performance, and therefore may be useful to facilitate diagnosis of COVID-19.
IV | May 5, 2020
3D Tomographic Pattern Synthesis for Enhancing the Quantification of COVID-19
Siqi Liu, Bogdan Georgescu, Zhoubing Xu et al.
The Coronavirus Disease (COVID-19) has affected 1.8 million people and resulted in more than 110,000 deaths as of April 12, 2020. Several studies have shown that tomographic patterns seen on chest Computed Tomography (CT), such as ground-glass opacities, consolidations, and crazy paving pattern, are correlated with the disease severity and progression. CT imaging can thus emerge as an important modality for the management of COVID-19 patients. AI-based solutions can be used to support CT-based quantitative reporting and make reading efficient and reproducible if quantitative biomarkers, such as the Percentage of Opacity (PO), can be automatically computed. However, COVID-19 has posed unique challenges to the development of AI, specifically concerning the availability of appropriate image data and annotations at scale. In this paper, we propose to use synthetic datasets to augment an existing COVID-19 database to tackle these challenges. We train a Generative Adversarial Network (GAN) to inpaint COVID-19 related tomographic patterns on chest CTs from patients without infectious diseases. Additionally, we leverage location priors derived from manually labeled chest CTs of COVID-19 patients to generate appropriate abnormality distributions. Synthetic data are used to improve both lung segmentation and segmentation of COVID-19 patterns by adding 20% of synthetic data to the real COVID-19 training data. We collected 2143 chest CTs, containing 327 COVID-19 positive cases, acquired from 12 sites across 7 countries. By testing on 100 COVID-19 positive and 100 control cases, we show that synthetic data can help improve both lung segmentation (+6.02% lesion inclusion rate) and abnormality segmentation (+2.78% Dice coefficient), leading to an overall more accurate PO computation (+2.82% Pearson coefficient).
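Two of the quantities mentioned above, the Dice coefficient and the Percentage of Opacity (PO), can be computed from binary masks as sketched below. The masks are hypothetical flattened voxel lists, the function names are illustrative, and PO is assumed here to mean abnormal voxels inside the lungs as a percentage of all lung voxels; the abstract itself does not spell out the formula.

```python
def dice_coefficient(pred, truth):
    """Dice overlap 2|A∩B| / (|A| + |B|) for two binary masks (0/1 lists)."""
    intersection = sum(p and t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    return 2.0 * intersection / total if total else 1.0

def percentage_of_opacity(lesion_mask, lung_mask):
    """Assumed PO definition: lesion voxels inside the lung / lung voxels * 100."""
    lung_voxels = sum(lung_mask)
    lesion_in_lung = sum(l and g for l, g in zip(lesion_mask, lung_mask))
    return 100.0 * lesion_in_lung / lung_voxels

# Toy volume of 10 voxels: 8 lung voxels, 2 of them truly abnormal.
lung = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
lesion_truth = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
lesion_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]

dice = dice_coefficient(lesion_pred, lesion_truth)
po_truth = percentage_of_opacity(lesion_truth, lung)
po_pred = percentage_of_opacity(lesion_pred, lung)
```

Because PO divides by the lung volume, errors in the lung mask propagate into PO just as lesion segmentation errors do, which is consistent with the abstract improving both segmentations.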
IV | Apr 2, 2020
Automated Quantification of CT Patterns Associated with COVID-19 from Chest CT
Shikha Chaganti, Abishek Balachandran, Guillaume Chabin et al.
Purpose: To present a method that automatically segments and quantifies abnormal CT patterns commonly present in coronavirus disease 2019 (COVID-19), namely ground glass opacities and consolidations. Materials and Methods: In this retrospective study, the proposed method takes as input a non-contrast chest CT and segments the lesions, lungs, and lobes in three dimensions, based on a dataset of 9749 chest CT volumes. The method outputs two combined measures of the severity of lung and lobe involvement, quantifying both the extent of COVID-19 abnormalities and presence of high opacities, based on deep learning and deep reinforcement learning. The first measure (PO, PHO) is global, while the second (LSS, LHOS) is lobewise. Evaluation of the algorithm is reported on CTs of 200 participants (100 COVID-19 confirmed patients and 100 healthy controls) from institutions in Canada, Europe and the United States, collected between 2002 and the present (April 2020). Ground truth is established by manual annotations of lesions, lungs, and lobes. Correlation and regression analyses were performed to compare the prediction to the ground truth. Results: Pearson correlation coefficient between method prediction and ground truth for COVID-19 cases was calculated as 0.92 for PO (P < .001), 0.97 for PHO (P < .001), 0.91 for LSS (P < .001), and 0.90 for LHOS (P < .001). Of the 100 healthy controls, 98 had a predicted PO of less than 1% and 2 had a predicted PO between 1% and 2%. Automated processing time to compute the severity scores was 10 seconds per case compared to 30 minutes required for manual annotations. Conclusion: A new method segments regions of CT abnormalities associated with COVID-19 and computes (PO, PHO), as well as (LSS, LHOS) severity scores.
IV | Mar 18, 2020
Graph Attention Network based Pruning for Reconstructing 3D Liver Vessel Morphology from Contrasted CT Images
Donghao Zhang, Siqi Liu, Shikha Chaganti et al.
With the injection of contrast material into blood vessels, multi-phase contrasted CT images can enhance the visibility of vessel networks in the human body. Reconstructing the 3D geometric morphology of liver vessels from the contrasted CT images can enable multiple liver preoperative surgical planning applications. Automatic reconstruction of liver vessel morphology remains a challenging problem due to the morphological complexity of liver vessels and the inconsistent vessel intensities among different multi-phase contrasted CT images. At the same time, high integrity is required for the 3D reconstruction to avoid decision-making biases. In this paper, we propose a framework for liver vessel morphology reconstruction using both a fully convolutional neural network and a graph attention network. A fully convolutional neural network is first trained to produce the liver vessel centerline heatmap. An over-reconstructed liver vessel graph model is then traced based on the heatmap using an image-processing-based algorithm. We use a graph attention network to prune the false-positive branches by predicting the presence probability of each segmented branch in the initial reconstruction using the aggregated CNN features. We evaluated the proposed framework on an in-house dataset consisting of 418 multi-phase abdomen CT images with contrast. The proposed graph network pruning improves the overall reconstruction F1 score by 6.4% over the baseline. It also outperformed the other state-of-the-art curvilinear structure reconstruction algorithms.
IV | Mar 8, 2020
No Surprises: Training Robust Lung Nodule Detection for Low-Dose CT Scans by Augmenting with Adversarial Attacks
Siqi Liu, Arnaud Arindra Adiyoso Setio, Florin C. Ghesu et al.
Detecting malignant pulmonary nodules at an early stage can allow medical interventions which may increase the survival rate of lung cancer patients. Using computer vision techniques to detect nodules can improve the sensitivity and the speed of interpreting chest CT for lung cancer screening. Many studies have used CNNs to detect nodule candidates. Though such approaches have been shown to outperform the conventional image-processing-based methods in detection accuracy, CNNs are also known to generalize poorly to under-represented samples in the training set and to be vulnerable to imperceptible noise perturbations. Such limitations cannot be easily addressed by scaling up the dataset or the models. In this work, we propose to add adversarial synthetic nodules and adversarial attack samples to the training data to improve the generalization and the robustness of the lung nodule detection systems. To generate hard examples of nodules from a differentiable nodule synthesizer, we use projected gradient descent (PGD) to search the latent code within a bounded neighbourhood for nodules that decrease the detector response. To make the network more robust to unanticipated noise perturbations, we use PGD to search for noise patterns that can trigger the network to make over-confident mistakes. By evaluating on two different benchmark datasets containing consensus annotations from three radiologists, we show that the proposed techniques can improve the detection performance on real CT data. To understand the limitations of both the conventional networks and the proposed augmented networks, we also perform stress-tests on the false positive reduction networks by feeding different types of artificially produced patches. We show that the augmented networks are more robust to under-represented nodules and more resistant to noise perturbations.
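The latent-code search with projected gradient descent described above can be sketched on a toy problem. Everything here is an assumption made for illustration: the sigmoid `detector_response` stands in for the synthesizer-plus-detector pipeline, gradients are estimated by finite differences instead of automatic differentiation, and the L-infinity ball, signed step, and parameter values are hypothetical choices rather than the paper's settings.

```python
import math

def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)

def pgd_minimize(response, z0, eps, step, n_steps, h=1e-4):
    """Projected gradient descent that lowers `response` while keeping the
    latent code z inside the L-infinity ball of radius `eps` around z0."""
    z = list(z0)
    for _ in range(n_steps):
        # Central finite-difference estimate of the gradient.
        grad = []
        for i in range(len(z)):
            z_plus, z_minus = z[:], z[:]
            z_plus[i] += h
            z_minus[i] -= h
            grad.append((response(z_plus) - response(z_minus)) / (2 * h))
        # Signed descent step, then projection back onto the allowed box.
        z = [zi - step * sign(g) for zi, g in zip(z, grad)]
        z = [min(max(zi, c - eps), c + eps) for zi, c in zip(z, z0)]
    return z

def detector_response(z):
    """Toy stand-in for the detector's confidence on a nodule generated from z."""
    return 1.0 / (1.0 + math.exp(-(z[0] + 2.0 * z[1])))

z_start = [1.0, 0.5]
eps = 0.3
z_hard = pgd_minimize(detector_response, z_start, eps=eps, step=0.1, n_steps=5)
```

The projection step is what keeps the hard example close to the original latent code, so the generated nodule stays plausible while the detector's confidence drops.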
CV | Jun 18, 2019
Quantifying and Leveraging Classification Uncertainty for Chest Radiograph Assessment
Florin C. Ghesu, Bogdan Georgescu, Eli Gibson et al.
The interpretation of chest radiographs is an essential task for the detection of thoracic diseases and abnormalities. However, it is a challenging problem with high inter-rater variability and inherent ambiguity due to inconclusive evidence in the data, limited data quality or subjective definitions of disease appearance. Current deep learning solutions for chest radiograph abnormality classification are typically limited to providing probabilistic predictions, relying on the capacity of learning models to adapt to the high degree of label noise and become robust to the enumerated causal factors. In practice, however, this leads to overconfident systems with poor generalization on unseen data. To account for this, we propose an automatic system that learns not only the probabilistic estimate on the presence of an abnormality, but also an explicit uncertainty measure which captures the confidence of the system in the predicted output. We argue that explicitly learning the classification uncertainty as an orthogonal measure to the predicted output is essential to account for the inherent variability characteristic of this data. Experiments were conducted on two datasets of chest radiographs of over 85,000 patients. Sample rejection based on the predicted uncertainty can significantly improve the ROC-AUC, e.g., by 8% to 0.91 with an expected rejection rate of under 25%. Eliminating training samples using uncertainty-driven bootstrapping enables a significant increase in robustness and accuracy. In addition, we present a multi-reader study showing that the predictive uncertainty is indicative of reader errors.
CV May 15, 2019
Multi-task Learning for Chest X-ray Abnormality Classification on Noisy Labels
Sebastian Guendel, Florin C. Ghesu, Sasa Grbic et al.
Chest X-ray (CXR) is the most common X-ray examination performed in daily clinical practice for the diagnosis of various heart and lung abnormalities. The large amount of data to be read and reported, with 100+ studies per day for a single radiologist, poses a challenge in maintaining consistently high interpretation accuracy. In this work, we propose a method for the classification of different abnormalities based on CXR scans of the human body. The system is based on a novel multi-task deep learning architecture that, in addition to abnormality classification, supports the segmentation of the lungs and heart and the classification of regions where the abnormality is located. We demonstrate that by training these tasks concurrently, one can increase the classification performance of the model. Experiments were performed on an extensive collection of 297,541 chest X-ray images from 86,876 patients, leading to a state-of-the-art performance level of 0.883 AUC on average for 12 different abnormalities. We also conducted a detailed performance analysis and compared the accuracy of our system with that of 3 board-certified radiologists. In this context, we highlight the high level of label noise inherent to this problem. On a reduced subset containing only cases with high-confidence reference labels based on the consensus of the 3 radiologists, our system reached an average AUC of 0.945.
CV Dec 28, 2018
Class-Aware Adversarial Lung Nodule Synthesis in CT Images
Jie Yang, Siqi Liu, Sasa Grbic et al.
Though large-scale datasets are essential for training deep learning systems, it is expensive to scale up the collection of medical imaging datasets. Synthesizing objects of interest, such as lung nodules, in medical images based on the distribution of annotated datasets can help improve supervised learning tasks, especially when the datasets are limited in size and class balance. In this paper, we propose a class-aware adversarial synthesis framework to synthesize lung nodules in CT images. The framework is built with a coarse-to-fine patch in-painter (generator) and two class-aware discriminators. By conditioning on random latent variables and the target nodule labels, the trained networks are able to generate diverse nodules given the same context. By evaluating on the public LIDC-IDRI dataset, we demonstrate an example application of the proposed framework: improving the accuracy of lung nodule malignancy estimation as a binary classification problem, which is important in the lung screening scenario. We show that combining real image patches and synthetic lung nodules in the training set can improve the mean AUC classification score across different network architectures by 2%.
CV Dec 4, 2018
Decompose to manipulate: Manipulable Object Synthesis in 3D Medical Images with Structured Image Decomposition
Siqi Liu, Eli Gibson, Sasa Grbic et al.
The performance of medical image analysis systems is constrained by the quantity of high-quality image annotations. Such systems require data to be annotated by experts with years of training, especially when diagnostic decisions are involved. Such datasets are thus hard to scale up. In this context, it is hard for supervised learning systems to generalize to cases that are rare in the training set but would be present in real-world clinical practice. We believe that synthetic image samples generated by a system trained on real data can be useful for improving supervised learning tasks in medical image analysis applications. Allowing the image synthesis to be manipulable could help synthetic images provide complementary information to the training data rather than simply duplicating the real-data manifold. In this paper, we propose a framework for synthesizing 3D objects, such as pulmonary nodules, in 3D medical images with manipulable properties. The manipulation is enabled by decomposing the object of interest into its segmentation mask and a 1D vector containing the residual information. The synthetic object is refined and blended into the image context with two adversarial discriminators. We evaluate the proposed framework on lung nodules in 3D chest CT images and show that it can generate realistic nodules with manipulable shapes, textures, and locations. By sampling from both the synthetic nodules and the real nodules from 2800 3D CT volumes during classifier training, we show that the synthetic patches can improve the overall nodule detection performance by an average of 8.44% in competition performance metric (CPM) score.
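For readers unfamiliar with the CPM score, the sketch below computes it under one common definition: the mean sensitivity at seven operating points on the FROC curve (1/8, 1/4, 1/2, 1, 2, 4, and 8 false positives per scan). The abstract does not state its operating points, so treating them as these seven is an assumption, and the FROC values are invented for illustration.

```python
import numpy as np

# Operating points commonly used for CPM (assumed, not stated in the abstract).
CPM_FP_RATES = (0.125, 0.25, 0.5, 1.0, 2.0, 4.0, 8.0)

def cpm_score(fps_per_scan, sensitivity):
    """Average sensitivity at the CPM operating points, linearly
    interpolating the FROC curve between the measured points."""
    fps = np.asarray(fps_per_scan, dtype=float)
    sens = np.asarray(sensitivity, dtype=float)
    order = np.argsort(fps)                 # np.interp needs increasing x
    return float(np.mean(np.interp(CPM_FP_RATES, fps[order], sens[order])))

# Hypothetical FROC curve sampled exactly at the operating points.
froc_fps = [0.125, 0.25, 0.5, 1.0, 2.0, 4.0, 8.0]
froc_sens = [0.50, 0.60, 0.70, 0.80, 0.85, 0.90, 0.95]
cpm = cpm_score(froc_fps, froc_sens)
```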
CV May 25, 2018
Less is More: Simultaneous View Classification and Landmark Detection for Abdominal Ultrasound Images
Zhoubing Xu, Yuankai Huo, JinHyeong Park et al.
An abdominal ultrasound examination, which is the most common ultrasound examination, requires substantial manual effort to acquire standard abdominal organ views, annotate the views in text, and record clinically relevant organ measurements. Hence, automatic view classification and landmark detection of the organs can be instrumental in streamlining the examination workflow. However, this is a challenging problem given not only the inherent difficulties of the ultrasound modality, e.g., low contrast and large variations, but also the heterogeneity across tasks, i.e., one classification task for all views, and then one landmark detection task for each relevant view. While convolutional neural networks (CNNs) have demonstrated more promising outcomes on ultrasound image analysis than traditional machine learning approaches, it becomes impractical to deploy multiple networks (one for each task) due to the limited computational and memory resources on most existing ultrasound scanners. To overcome such limits, we propose a multi-task learning framework that handles all the tasks with a single network. This network is integrated to perform view classification and landmark detection simultaneously; it is also equipped with global convolutional kernels, coordinate constraints, and a conditional adversarial module to improve performance. In an experimental study based on 187,219 ultrasound images, with the proposed simplified approach we achieve (1) view classification accuracy better than the agreement between two clinical experts and (2) landmark-based measurement errors on par with inter-user variability. The multi-task approach also benefits from sharing the feature extraction across all tasks during training and, as a result, outperforms approaches that address each task individually.
CV Apr 14, 2018
Select, Attend, and Transfer: Light, Learnable Skip Connections
Saeid Asgari Taghanaki, Aicha Bentaieb, Anmol Sharma et al.
Skip connections in deep networks have improved both segmentation and classification performance by facilitating the training of deeper network architectures and reducing the risk of vanishing gradients. They equip encoder-decoder-like networks with richer feature representations, but at the cost of higher memory usage and computation, and possibly of transferring non-discriminative feature maps. In this paper, we focus on improving skip connections used in segmentation networks (e.g., U-Net, V-Net, and The One Hundred Layers Tiramisu (DenseNet) architectures). We propose light, learnable skip connections which learn to first select the most discriminative channels and then attend to the most discriminative regions of the selected feature maps. The output of the proposed skip connections is a unique feature map which not only substantially reduces the memory usage and the number of network parameters, but also improves segmentation accuracy. We evaluate the proposed method on three different 2D and volumetric datasets and demonstrate that the proposed light, learnable skip connections can outperform the traditional heavy skip connections in terms of segmentation accuracy, memory usage, and number of network parameters.
CV Mar 12, 2018
Learning to recognize Abnormalities in Chest X-Rays with Location-Aware Dense Networks
Sebastian Guendel, Sasa Grbic, Bogdan Georgescu et al.
Chest X-ray is the most common medical imaging exam used to assess multiple pathologies. Automated algorithms and tools have the potential to support the reading workflow, improve efficiency, and reduce reading errors. With the availability of large-scale data sets, several methods have been proposed to classify pathologies on chest X-ray images. However, most methods report performance based on random image-based splitting, ignoring the high probability of the same patient appearing in both the training and test sets. In addition, most methods fail to explicitly incorporate the spatial information of abnormalities or to utilize the high-resolution images. We propose a novel approach based on location-aware Dense Networks (DNetLoc), whereby we incorporate both high-resolution image data and spatial information for abnormality classification. We evaluate our method on the largest data set reported in the community, containing a total of 86,876 patients and 297,541 chest X-ray images. We achieve (i) the best average AUC score for published training and test splits on the single benchmarking data set (ChestX-Ray14), and (ii) improved AUC scores when the pathology location information is explicitly used. To foster future research, we demonstrate the limitations of the current benchmarking setup and provide new reference patient-wise splits for the data sets used. This could support consistent and meaningful benchmarking of future methods on the largest publicly available data sets.
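The difference between image-based and patient-wise splitting can be made concrete with a short sketch. The function name `patient_wise_split`, the identifiers, and the 20% test fraction are illustrative assumptions; the reference splits released with the paper are not reproduced here.

```python
import random
from collections import defaultdict

def patient_wise_split(image_to_patient, test_fraction=0.2, seed=0):
    """Split image IDs so that every patient's images land entirely in
    either the training set or the test set, never both."""
    by_patient = defaultdict(list)
    for image_id, patient_id in image_to_patient.items():
        by_patient[patient_id].append(image_id)

    patients = sorted(by_patient)              # deterministic order first
    random.Random(seed).shuffle(patients)      # then a seeded shuffle
    n_test = max(1, int(len(patients) * test_fraction))
    test_patients = set(patients[:n_test])

    train = [img for p in patients if p not in test_patients
             for img in by_patient[p]]
    test = [img for p in patients if p in test_patients
            for img in by_patient[p]]
    return train, test

# Toy example: 10 hypothetical patients with 3 images each.
image_to_patient = {f"img_{i}": f"patient_{i // 3}" for i in range(30)}
train_images, test_images = patient_wise_split(image_to_patient)
```

Shuffling patients rather than images is the whole point: a random split over images would likely place different images of the same patient on both sides, which inflates test performance.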
CV Nov 23, 2017
3D Anisotropic Hybrid Network: Transferring Convolutional Features from 2D Images to 3D Anisotropic Volumes
Siqi Liu, Daguang Xu, S. Kevin Zhou et al.
While deep convolutional neural networks (CNNs) have been successfully applied to 2D image analysis, it is still challenging to apply them to 3D anisotropic volumes, especially when the within-slice resolution is much higher than the between-slice resolution and when the number of 3D volumes is relatively small. On the one hand, directly training a CNN with 3D convolution kernels suffers from the lack of data and is likely to end up with poor generalization; insufficient GPU memory also limits the model size and representational power. On the other hand, applying a 2D CNN with generalizable features to 2D slices ignores between-slice information. Coupling a 2D network with an LSTM to further handle the between-slice information is not optimal due to the difficulty of LSTM learning. To overcome the above challenges, we propose a 3D Anisotropic Hybrid Network (AH-Net) that transfers convolutional features learned from 2D images to 3D anisotropic volumes. Such a transfer inherits the desired strong generalization capability for within-slice information while naturally exploiting between-slice information for more effective modelling. The focal loss is further utilized for more effective end-to-end learning. We experiment with the proposed 3D AH-Net on two different medical image analysis tasks, namely lesion detection from a Digital Breast Tomosynthesis volume and liver and liver tumor segmentation from a Computed Tomography volume, and obtain state-of-the-art results.
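The focal loss mentioned in this abstract down-weights easy examples so that training concentrates on hard ones. Below is a minimal binary version in NumPy, offered as a sketch: the abstract does not say which variant or hyperparameters AH-Net uses, so `gamma=2.0` and `alpha=0.25` are illustrative defaults and the function name is hypothetical.

```python
import numpy as np

def binary_focal_loss(probs, targets, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss: -alpha_t * (1 - p_t)**gamma * log(p_t),
    where p_t is the probability assigned to the true class."""
    probs = np.clip(probs, eps, 1.0 - eps)            # avoid log(0)
    p_t = np.where(targets == 1, probs, 1.0 - probs)
    alpha_t = np.where(targets == 1, alpha, 1.0 - alpha)
    return float(np.mean(-alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)))

# Two well-classified examples: p_t = 0.9 for both.
probs = np.array([0.9, 0.1])
targets = np.array([1, 0])
loss_plain = binary_focal_loss(probs, targets, gamma=0.0, alpha=0.5)
loss_focal = binary_focal_loss(probs, targets, gamma=2.0, alpha=0.5)
```

With `gamma=0` the expression reduces to a class-weighted cross-entropy; increasing `gamma` shrinks the contribution of examples the model already classifies confidently.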
CV Jul 25, 2017
Automatic Liver Segmentation Using an Adversarial Image-to-Image Network
Dong Yang, Daguang Xu, S. Kevin Zhou et al.
Automatic liver segmentation in 3D medical images is essential in many clinical applications, such as pathological diagnosis of hepatic diseases, surgical planning, and postoperative assessment. However, it is still a very challenging task due to the complex background, fuzzy boundaries, and varied appearance of the liver. In this paper, we propose an automatic and efficient algorithm to segment the liver from 3D CT volumes. A deep image-to-image network (DI2IN) is first deployed to generate the liver segmentation, employing a convolutional encoder-decoder architecture combined with multi-level feature concatenation and deep supervision. Then an adversarial network is utilized during the training process to discriminate the output of DI2IN from the ground truth, which further boosts the performance of DI2IN. The proposed method is trained on an annotated dataset of 1000 CT volumes with a variety of scanning protocols (e.g., contrast and non-contrast, various resolutions and positions) and large variations in populations (e.g., ages and pathology). Our approach outperforms the state-of-the-art solutions in terms of segmentation accuracy and computing efficiency.
CV Nov 30, 2016
An Artificial Agent for Robust Image Registration
Rui Liao, Shun Miao, Pierre de Tournemire et al.
3-D image registration, which involves aligning two or more images, is a critical step in a variety of medical applications from diagnosis to therapy. Image registration is commonly performed by optimizing an image matching metric as a cost function. However, this task is challenging due to the non-convex nature of the matching metric over the plausible registration parameter space and the lack of approaches for robust optimization. As a result, current approaches are often customized to a specific problem and sensitive to image quality and artifacts. In this paper, we propose a completely different approach to image registration, inspired by how experts perform the task. We first cast the image registration problem as a "strategy learning" process, where the goal is to find the best sequence of motion actions (e.g. up, down, etc.) that yields image alignment. Within this approach, an artificial agent, modeled using deep convolutional neural networks, is learned, with 3D raw image data as the input and the next optimal action as the output. To cope with the dimensionality of the problem, we propose a greedy supervised approach for end-to-end training, coupled with an attention-driven hierarchical strategy. The resulting registration approach inherently encodes both a data-driven matching metric and an optimal registration strategy (policy). We demonstrate, on two 3-D/3-D medical image registration examples with challenges of a drastically different nature, that the artificial agent outperforms several state-of-the-art registration methods by a large margin in terms of both accuracy and robustness.