Andriy Fedorov

h-index35

11papers

546citations

Novelty25%

AI Score38

Ranked #88,545 of 194,257 authors (top 46%)#1,044 in IV (top 24%)

11 Papers

6.3IVSep 30, 2024Code

AI generated annotations for Breast, Brain, Liver, Lungs and Prostate cancer collections in National Cancer Institute Imaging Data Commons

Gowtham Krishnan Murugesan, Diana McCrumb, Rahul Soni et al.

AI in Medical Imaging project aims to enhance the National Cancer Institute's (NCI) Image Data Commons (IDC) by developing nnU-Net models and providing AI-assisted segmentations for cancer radiology images. We created high-quality, AI-annotated imaging datasets for 11 IDC collections. These datasets include images from various modalities, such as computed tomography (CT) and magnetic resonance imaging (MRI), covering the lungs, breast, brain, kidneys, prostate, and liver. The nnU-Net models were trained using open-source datasets. A portion of the AI-generated annotations was reviewed and corrected by radiologists. Both the AI and radiologist annotations were encoded in compliance with the the Digital Imaging and Communications in Medicine (DICOM) standard, ensuring seamless integration into the IDC collections. All models, images, and annotations are publicly accessible, facilitating further research and development in cancer imaging. This work supports the advancement of imaging tools and algorithms by providing comprehensive and accurate annotated datasets.

6.3IVMar 22, 2024Code

Towards Automatic Abdominal MRI Organ Segmentation: Leveraging Synthesized Data Generated From CT Labels

Cosmin Ciausu, Deepa Krishnaswamy, Benjamin Billot et al.

Deep learning has shown great promise in the ability to automatically annotate organs in magnetic resonance imaging (MRI) scans, for example, of the brain. However, despite advancements in the field, the ability to accurately segment abdominal organs remains difficult across MR. In part, this may be explained by the much greater variability in image appearance and severely limited availability of training labels. The inherent nature of computed tomography (CT) scans makes it easier to annotate, resulting in a larger availability of expert annotations for the latter. We leverage a modality-agnostic domain randomization approach, utilizing CT label maps to generate synthetic images on-the-fly during training, further used to train a U-Net segmentation network for abdominal organs segmentation. Our approach shows comparable results compared to fully-supervised segmentation methods trained on MR data. Our method results in Dice scores of 0.90 (0.08) and 0.91 (0.08) for the right and left kidney respectively, compared to a pretrained nnU-Net model yielding 0.87 (0.20) and 0.91 (0.03). We will make our code publicly available.

5.1IVJul 23, 2025Code

Benchmarking of Deep Learning Methods for Generic MRI Multi-Organ Abdominal Segmentation

Deepa Krishnaswamy, Cosmin Ciausu, Steve Pieper et al.

Recent advances in deep learning have led to robust automated tools for segmentation of abdominal computed tomography (CT). Meanwhile, segmentation of magnetic resonance imaging (MRI) is substantially more challenging due to the inherent signal variability and the increased effort required for annotating training datasets. Hence, existing approaches are trained on limited sets of MRI sequences, which might limit their generalizability. To characterize the landscape of MRI abdominal segmentation tools, we present here a comprehensive benchmarking of the three state-of-the-art and open-source models: MRSegmentator, MRISegmentator-Abdomen, and TotalSegmentator MRI. Since these models are trained using labor-intensive manual annotation cycles, we also introduce and evaluate ABDSynth, a SynthSeg-based model purely trained on widely available CT segmentations (no real images). More generally, we assess accuracy and generalizability by leveraging three public datasets (not seen by any of the evaluated methods during their training), which span all major manufacturers, five MRI sequences, as well as a variety of subject conditions, voxel resolutions, and fields-of-view. Our results reveal that MRSegmentator achieves the best performance and is most generalizable. In contrast, ABDSynth yields slightly less accurate results, but its relaxed requirements in training data make it an alternative when the annotation budget is limited. The evaluation code and datasets are given for future benchmarking at https://github.com/deepakri201/AbdoBench, along with inference code and weights for ABDSynth.

3.6IVApr 16, 2024Code

Automatic classification of prostate MR series type using image content and metadata

Deepa Krishnaswamy, Bálint Kovács, Stefan Denner et al.

With the wealth of medical image data, efficient curation is essential. Assigning the sequence type to magnetic resonance images is necessary for scientific studies and artificial intelligence-based analysis. However, incomplete or missing metadata prevents effective automation. We therefore propose a deep-learning method for classification of prostate cancer scanning sequences based on a combination of image data and DICOM metadata. We demonstrate superior results compared to metadata or image data alone, and make our code publicly available at https://github.com/deepakri201/DICOMScanClassification.

6.1IVJun 14, 2021

Highdicom: A Python library for standardized encoding of image annotations and machine learning model outputs in pathology and radiology

Christopher P. Bridge, Chris Gorman, Steven Pieper et al.

Machine learning is revolutionizing image-based diagnostics in pathology and radiology. ML models have shown promising results in research settings, but their lack of interoperability has been a major barrier for clinical integration and evaluation. The DICOM a standard specifies Information Object Definitions and Services for the representation and communication of digital images and related information, including image-derived annotations and analysis results. However, the complexity of the standard represents an obstacle for its adoption in the ML community and creates a need for software libraries and tools that simplify working with data sets in DICOM format. Here we present the highdicom library, which provides a high-level application programming interface for the Python programming language that abstracts low-level details of the standard and enables encoding and decoding of image-derived information in DICOM format in a few lines of Python code. The highdicom library ties into the extensive Python ecosystem for image processing and machine learning. Simultaneously, by simplifying creation and parsing of DICOM-compliant files, highdicom achieves interoperability with the medical imaging systems that hold the data used to train and run ML models, and ultimately communicate and store model outputs for clinical use. We demonstrate through experiments with slide microscopy and computed tomography imaging, that, by bridging these two ecosystems, highdicom enables developers to train and evaluate state-of-the-art ML models in pathology and radiology while remaining compliant with the DICOM standard and interoperable with clinical systems at all stages. To promote standardization of ML research and streamline the ML model development and deployment process, we made the library available free and open-source.

2.3OTDec 27, 2019Code

Open Source Software Sustainability Models: Initial White Paper from the Informatics Technology for Cancer Research Sustainability and Industry Partnership Work Group

Y. Ye, R. D. Boyce, M. K. Davis et al.

The Sustainability and Industry Partnership Work Group (SIP-WG) is a part of the National Cancer Institute Informatics Technology for Cancer Research (ITCR) program. The charter of the SIP-WG is to investigate options of long-term sustainability of open source software (OSS) developed by the ITCR, in part by developing a collection of business model archetypes that can serve as sustainability plans for ITCR OSS development initiatives. The workgroup assembled models from the ITCR program, from other studies, and via engagement of its extensive network of relationships with other organizations (e.g., Chan Zuckerberg Initiative, Open Source Initiative and Software Sustainability Institute). This article reviews existing sustainability models and describes ten OSS use cases disseminated by the SIP-WG and others, and highlights five essential attributes (alignment with unmet scientific needs, dedicated development team, vibrant user community, feasible licensing model, and sustainable financial model) to assist academic software developers in achieving best practice in software sustainability.

7.1LGNov 26, 2019Code

ModelHub.AI: Dissemination Platform for Deep Learning Models

Ahmed Hosny, Michael Schwier, Christoph Berger et al.

Recent advances in artificial intelligence research have led to a profusion of studies that apply deep learning to problems in image analysis and natural language processing among others. Additionally, the availability of open-source computational frameworks has lowered the barriers to implementing state-of-the-art methods across multiple domains. Albeit leading to major performance breakthroughs in some tasks, effective dissemination of deep learning algorithms remains challenging, inhibiting reproducibility and benchmarking studies, impeding further validation, and ultimately hindering their effectiveness in the cumulative scientific progress. In developing a platform for sharing research outputs, we present ModelHub.AI (www.modelhub.ai), a community-driven container-based software engine and platform for the structured dissemination of deep learning models. For contributors, the engine controls data flow throughout the inference cycle, while the contributor-facing standard template exposes model-specific functions including inference, as well as pre- and post-processing. Python and RESTful Application programming interfaces (APIs) enable users to interact with models hosted on ModelHub.AI and allows both researchers and developers to utilize models out-of-the-box. ModelHub.AI is domain-, data-, and framework-agnostic, catering to different workflows and contributors' preferences.

8.3CVJul 16, 2018Code

Repeatability of Multiparametric Prostate MRI Radiomics Features

Michael Schwier, Joost van Griethuysen, Mark G Vangel et al.

In this study we assessed the repeatability of the values of radiomics features for small prostate tumors using test-retest Multiparametric Magnetic Resonance Imaging (mpMRI) images. The premise of radiomics is that quantitative image features can serve as biomarkers characterizing disease. For such biomarkers to be useful, repeatability is a basic requirement, meaning its value must remain stable between two scans, if the conditions remain stable. We investigated repeatability of radiomics features under various preprocessing and extraction configurations including various image normalization schemes, different image pre-filtering, 2D vs 3D texture computation, and different bin widths for image discretization. Image registration as means to re-identify regions of interest across time points was evaluated against human-expert segmented regions in both time points. Even though we found many radiomics features and preprocessing combinations with a high repeatability (Intraclass Correlation Coefficient (ICC) > 0.85), our results indicate that overall the repeatability is highly sensitive to the processing parameters (under certain configurations, it can be below 0.0). Image normalization, using a variety of approaches considered, did not result in consistent improvements in repeatability. There was also no consistent improvement of repeatability through the use of pre-filtering options, or by using image registration between timepoints to improve consistency of the region of interest localization. Based on these results we urge caution when interpreting radiomics features and advise paying close attention to the processing configuration details of reported results. Furthermore, we advocate reporting all processing details in radiomics studies and strongly recommend making the implementation available.

32.5IVJan 15, 2025Code

Vision Foundation Models for Computed Tomography

Suraj Pai, Ibrahim Hadzic, Dennis Bontempi et al.

Foundation models (FMs) have shown transformative potential in radiology by performing diverse, complex tasks across imaging modalities. Here, we developed CT-FM, a large-scale 3D image-based pre-trained model designed explicitly for various radiological tasks. CT-FM was pre-trained using 148,000 computed tomography (CT) scans from the Imaging Data Commons through label-agnostic contrastive learning. We evaluated CT-FM across four categories of tasks, namely, whole-body and tumor segmentation, head CT triage, medical image retrieval, and semantic understanding, showing superior performance against state-of-the-art models. Beyond quantitative success, CT-FM demonstrated the ability to cluster regions anatomically and identify similar anatomical and structural concepts across scans. Furthermore, it remained robust across test-retest settings and indicated reasonable salient regions attached to its embeddings. This study demonstrates the value of large-scale medical imaging foundation models and by open-sourcing the model weights, code, and data, aims to support more adaptable, reliable, and interpretable AI solutions in radiology.

0.9CVMay 7, 2017

Large scale digital prostate pathology image analysis combining feature extraction and deep neural network

Naiyun Zhou, Andrey Fedorov, Fiona Fennessy et al.

Histopathological assessments, including surgical resection and core needle biopsy, are the standard procedures in the diagnosis of the prostate cancer. Current interpretation of the histopathology images includes the determination of the tumor area, Gleason grading, and identification of certain prognosis-critical features. Such a process is not only tedious, but also prune to intra/inter-observe variabilities. Recently, FDA cleared the marketing of the first whole slide imaging system for digital pathology. This opens a new era for the computer aided prostate image analysis and feature extraction based on the digital histopathology images. In this work, we present an analysis pipeline that includes localization of the cancer region, grading, area ratio of different Gleason grades, and cytological/architectural feature extraction. The proposed algorithm combines the human engineered feature extraction as well as those learned by the deep neural network. Moreover, the entire pipeline is implemented to directly operate on the whole slide images produced by the digital scanners and is therefore potentially easy to translate into clinical practices. The algorithm is tested on 368 whole slide images from the TCGA data set and achieves an overall accuracy of 75% in differentiating Gleason 3+4 with 4+3 slides.

24.9CVMar 5, 2013

GBM Volumetry using the 3D Slicer Medical Image Computing Platform

Jan Egger, Tina Kapur, Andriy Fedorov et al.

Volumetric change in glioblastoma multiforme (GBM) over time is a critical factor in treatment decisions. Typically, the tumor volume is computed on a slice-by-slice basis using MRI scans obtained at regular intervals. (3D)Slicer - a free platform for biomedical research - provides an alternative to this manual slice-by-slice segmentation process, which is significantly faster and requires less user interaction. In this study, 4 physicians segmented GBMs in 10 patients, once using the competitive region-growing based GrowCut segmentation module of Slicer, and once purely by drawing boundaries completely manually on a slice-by-slice basis. Furthermore, we provide a variability analysis for three physicians for 12 GBMs. The time required for GrowCut segmentation was on an average 61% of the time required for a pure manual segmentation. A comparison of Slicer-based segmentation with manual slice-by-slice segmentation resulted in a Dice Similarity Coefficient of 88.43 +/- 5.23% and a Hausdorff Distance of 2.32 +/- 5.23 mm.