Arman Rahmim

MED-PH

h-index63

36papers

427citations

Novelty31%

AI Score48

Ranked #29,091 of 194,257 authors (top 15%)#2 in MED-PH (top 2%)

36 Papers

7.3IVSep 24, 2023Code

Generalized Dice Focal Loss trained 3D Residual UNet for Automated Lesion Segmentation in Whole-Body FDG PET/CT Images

Shadab Ahamed, Arman Rahmim

Automated segmentation of cancerous lesions in PET/CT images is a vital initial task for quantitative analysis. However, it is often challenging to train deep learning-based segmentation methods to high degree of accuracy due to the diversity of lesions in terms of their shapes, sizes, and radiotracer uptake levels. These lesions can be found in various parts of the body, often close to healthy organs that also show significant uptake. Consequently, developing a comprehensive PET/CT lesion segmentation model is a demanding endeavor for routine quantitative image analysis. In this work, we train a 3D Residual UNet using Generalized Dice Focal Loss function on the AutoPET challenge 2023 training dataset. We develop our models in a 5-fold cross-validation setting and ensemble the five models via average and weighted-average ensembling. On the preliminary test phase, the average ensemble achieved a Dice similarity coefficient (DSC), false-positive volume (FPV) and false negative volume (FNV) of 0.5417, 0.8261 ml, and 0.2538 ml, respectively, while the weighted-average ensemble achieved 0.5417, 0.8186 ml, and 0.2538 ml, respectively. Our algorithm can be accessed via this link: https://github.com/ahxmeds/autosegnet.

1.2MED-PHNov 7, 2022

Issues and Challenges in Applications of Artificial Intelligence to Nuclear Medicine -- The Bethesda Report (AI Summit 2022)

Arman Rahmim, Tyler J. Bradshaw, Irène Buvat et al. · uw

The SNMMI Artificial Intelligence (SNMMI-AI) Summit, organized by the SNMMI AI Task Force, took place in Bethesda, MD on March 21-22, 2022. It brought together various community members and stakeholders from academia, healthcare, industry, patient representatives, and government (NIH, FDA), and considered various key themes to envision and facilitate a bright future for routine, trustworthy use of AI in nuclear medicine. In what follows, essential issues, challenges, controversies and findings emphasized in the meeting are summarized.

6.8CVNov 16, 2023Code

Comprehensive framework for evaluation of deep neural networks in detection and quantification of lymphoma from PET/CT images: clinical insights, pitfalls, and observer agreement analyses

Shadab Ahamed, Yixi Xu, Sara Kurkowska et al.

This study addresses critical gaps in automated lymphoma segmentation from PET/CT images, focusing on issues often overlooked in existing literature. While deep learning has been applied for lymphoma lesion segmentation, few studies incorporate out-of-distribution testing, raising concerns about model generalizability across diverse imaging conditions and patient populations. We highlight the need to compare model performance with expert human annotators, including intra- and inter-observer variability, to understand task difficulty better. Most approaches focus on overall segmentation accuracy but overlook lesion-specific measures important for precise lesion detection and disease quantification. To address these gaps, we propose a clinically relevant framework for evaluating deep segmentation networks. Using this lesion measure-specific evaluation, we assess the performance of four deep networks (ResUNet, SegResNet, DynUNet, and SwinUNETR) across 611 cases from multi-institutional datasets, covering various lymphoma subtypes and lesion characteristics. Beyond standard metrics like the Dice similarity coefficient, we evaluate clinical lesion measures and their prediction errors. We also introduce detection criteria for lesion localization and propose a new detection Criterion 3 based on metabolic characteristics. We show that networks perform better on large, intense lesions with higher metabolic activity. Finally, we compare network performance to physicians via intra- and inter-observer variability analyses, demonstrating that network errors closely resemble those made by experts, i.e., the small and faint lesions remain challenging for both humans and networks. This study aims to improve automated lesion segmentation's clinical relevance, supporting better treatment decisions for lymphoma patients. The code is available at: https://github.com/microsoft/lymphoma-segmentation-dnn.

2.6CVMar 12, 2022

Tensor Radiomics: Paradigm for Systematic Incorporation of Multi-Flavoured Radiomics Features

Arman Rahmim, Amirhosein Toosi, Mohammad R. Salmanpour et al.

Radiomics features extract quantitative information from medical images, towards the derivation of biomarkers for clinical tasks, such as diagnosis, prognosis, or treatment response assessment. Different image discretization parameters (e.g. bin number or size), convolutional filters, segmentation perturbation, or multi-modality fusion levels can be used to generate radiomics features and ultimately signatures. Commonly, only one set of parameters is used; resulting in only one value or flavour for a given RF. We propose tensor radiomics (TR) where tensors of features calculated with multiple combinations of parameters (i.e. flavours) are utilized to optimize the construction of radiomics signatures. We present examples of TR as applied to PET/CT, MRI, and CT imaging invoking machine learning or deep learning solutions, and reproducibility analyses: (1) TR via varying bin sizes on CT images of lung cancer and PET-CT images of head & neck cancer (HNC) for overall survival prediction. A hybrid deep neural network, referred to as TR-Net, along with two ML-based flavour fusion methods showed improved accuracy compared to regular rediomics features. (2) TR built from different segmentation perturbations and different bin sizes for classification of late-stage lung cancer response to first-line immunotherapy using CT images. TR improved predicted patient responses. (3) TR via multi-flavour generated radiomics features in MR imaging showed improved reproducibility when compared to many single-flavour features. (4) TR via multiple PET/CT fusions in HNC. Flavours were built from different fusions using methods, such as Laplacian pyramids and wavelet transforms. TR improved overall survival prediction. Our results suggest that the proposed TR paradigm has the potential to improve performance capabilities in different medical imaging tasks.

3.6IVJul 14, 2024Code

Thyroidiomics: An Automated Pipeline for Segmentation and Classification of Thyroid Pathologies from Scintigraphy Images

Maziar Sabouri, Shadab Ahamed, Azin Asadzadeh et al.

The objective of this study was to develop an automated pipeline that enhances thyroid disease classification using thyroid scintigraphy images, aiming to decrease assessment time and increase diagnostic accuracy. Anterior thyroid scintigraphy images from 2,643 patients were collected and categorized into diffuse goiter (DG), multinodal goiter (MNG), and thyroiditis (TH) based on clinical reports, and then segmented by an expert. A ResUNet model was trained to perform auto-segmentation. Radiomic features were extracted from both physician (scenario 1) and ResUNet segmentations (scenario 2), followed by omitting highly correlated features using Spearman's correlation, and feature selection using Recursive Feature Elimination (RFE) with XGBoost as the core. All models were trained under leave-one-center-out cross-validation (LOCOCV) scheme, where nine instances of algorithms were iteratively trained and validated on data from eight centers and tested on the ninth for both scenarios separately. Segmentation performance was assessed using the Dice similarity coefficient (DSC), while classification performance was assessed using metrics, such as precision, recall, F1-score, accuracy, area under the Receiver Operating Characteristic (ROC AUC), and area under the precision-recall curve (PRC AUC). ResUNet achieved DSC values of 0.84$\pm$0.03, 0.71$\pm$0.06, and 0.86$\pm$0.02 for MNG, TH, and DG, respectively. Classification in scenario 1 achieved an accuracy of 0.76$\pm$0.04 and a ROC AUC of 0.92$\pm$0.02 while in scenario 2, classification yielded an accuracy of 0.74$\pm$0.05 and a ROC AUC of 0.90$\pm$0.02. The automated pipeline demonstrated comparable performance to physician segmentations on several classification metrics across different classes, effectively reducing assessment time while maintaining high diagnostic accuracy. Code available at: https://github.com/ahxmeds/thyroidiomics.git.

4.8IVJun 13, 2022Code

Assessing Privacy Leakage in Synthetic 3-D PET Imaging using Transversal GAN

Robert V. Bergen, Jean-Francois Rajotte, Fereshteh Yousefirizi et al.

Training computer-vision related algorithms on medical images for disease diagnosis or image segmentation is difficult in large part due to privacy concerns. For this reason, generative image models are highly sought after to facilitate data sharing. However, 3-D generative models are understudied, and investigation of their privacy leakage is needed. We introduce our 3-D generative model, Transversal GAN (TrGAN), using head & neck PET images which are conditioned on tumour masks as a case study. We define quantitative measures of image fidelity, utility and privacy for our model. These metrics are evaluated in the course of training to identify ideal fidelity, utility and privacy trade-offs and establish the relationships between these parameters. We show that the discriminator of the TrGAN is vulnerable to attack, and that an attacker can identify which samples were used in training with almost perfect accuracy (AUC = 0.99). We also show that an attacker with access to only the generator cannot reliably classify whether a sample had been used for training (AUC = 0.51). This suggests that TrGAN generators, but not discriminators, may be used for sharing synthetic 3-D PET data with minimal privacy risk while maintaining good utility and fidelity.

3.3MED-PHJul 26, 2024Code

How to Segment in 3D Using 2D Models: Automated 3D Segmentation of Prostate Cancer Metastatic Lesions on PET Volumes Using Multi-angle Maximum Intensity Projections and Diffusion Models

Amirhosein Toosi, Sara Harsini, François Bénard et al.

Prostate specific membrane antigen (PSMA) positron emission tomography/computed tomography (PET/CT) imaging provides a tremendously exciting frontier in visualization of prostate cancer (PCa) metastatic lesions. However, accurate segmentation of metastatic lesions is challenging due to low signal-to-noise ratios and variable sizes, shapes, and locations of the lesions. This study proposes a novel approach for automated segmentation of metastatic lesions in PSMA PET/CT 3D volumetric images using 2D denoising diffusion probabilistic models (DDPMs). Instead of 2D trans-axial slices or 3D volumes, the proposed approach segments the lesions on generated multi-angle maximum intensity projections (MA-MIPs) of the PSMA PET images, then obtains the final 3D segmentation masks from 3D ordered subset expectation maximization (OSEM) reconstruction of 2D MA-MIPs segmentations. Our proposed method achieved superior performance compared to state-of-the-art 3D segmentation approaches in terms of accuracy and robustness in detecting and segmenting small metastatic PCa lesions. The proposed method has significant potential as a tool for quantitative analysis of metastatic burden in PCa patients.

1.2MED-PHAug 1, 2024

Prognosis of COVID-19 using Artificial Intelligence: A Systematic Review and Meta-analysis

SaeedReza Motamedian, Sadra Mohaghegh, Elham Babadi Oregani et al.

Purpose: Artificial intelligence (AI) techniques have been extensively utilized for diagnosing and prognosis of several diseases in recent years. This study identifies, appraises and synthesizes published studies on the use of AI for the prognosis of COVID-19. Method: Electronic search was performed using Medline, Google Scholar, Scopus, Embase, Cochrane and ProQuest. Studies that examined machine learning or deep learning methods to determine the prognosis of COVID-19 using CT or chest X-ray images were included. Polled sensitivity, specificity area under the curve and diagnostic odds ratio were calculated. Result: A total of 36 articles were included; various prognosis-related issues, including disease severity, mechanical ventilation or admission to the intensive care unit and mortality, were investigated. Several AI models and architectures were employed, such as the Siamense model, support vector machine, Random Forest , eXtreme Gradient Boosting, and convolutional neural networks. The models achieved 71%, 88% and 67% sensitivity for mortality, severity assessment and need for ventilation, respectively. The specificity of 69%, 89% and 89% were reported for the aforementioned variables. Conclusion: Based on the included articles, machine learning and deep learning methods used for the prognosis of COVID-19 patients using radiomic features from CT or CXR images can help clinicians manage patients and allocate resources more effectively. These studies also demonstrate that combining patient demographic, clinical data, laboratory tests and radiomic features improves model performances.

8.5IVMar 25, 2024Code

MEDDAP: Medical Dataset Enhancement via Diversified Augmentation Pipeline

Yasamin Medghalchi, Niloufar Zakariaei, Arman Rahmim et al.

The effectiveness of Deep Neural Networks (DNNs) heavily relies on the abundance and accuracy of available training data. However, collecting and annotating data on a large scale is often both costly and time-intensive, particularly in medical cases where practitioners are already occupied with their duties. Moreover, ensuring that the model remains robust across various scenarios of image capture is crucial in medical domains, especially when dealing with ultrasound images that vary based on the settings of different devices and the manual operation of the transducer. To address this challenge, we introduce a novel pipeline called MEDDAP, which leverages Stable Diffusion (SD) models to augment existing small datasets by automatically generating new informative labeled samples. Pretrained checkpoints for SD are typically based on natural images, and training them for medical images requires significant GPU resources due to their heavy parameters. To overcome this challenge, we introduce USLoRA (Ultrasound Low-Rank Adaptation), a novel fine-tuning method tailored specifically for ultrasound applications. USLoRA allows for selective fine-tuning of weights within SD, requiring fewer than 0.1\% of parameters compared to fully fine-tuning only the UNet portion of SD. To enhance dataset diversity, we incorporate different adjectives into the generation process prompts, thereby desensitizing the classifiers to intensity changes across different images. This approach is inspired by clinicians' decision-making processes regarding breast tumors, where tumor shape often plays a more crucial role than intensity. In conclusion, our pipeline not only outperforms classifiers trained on the original dataset but also demonstrates superior performance when encountering unseen datasets. The source code is available at https://github.com/yasamin-med/MEDDAP.

11.3IVFeb 4, 2025Code

Adaptive Voxel-Weighted Loss Using L1 Norms in Deep Neural Networks for Detection and Segmentation of Prostate Cancer Lesions in PET/CT Images

Obed Korshie Dzikunu, Shadab Ahamed, Amirhossein Toosi et al.

This study proposes a new loss function for deep neural networks, L1-weighted Dice Focal Loss (L1DFL), that leverages L1 norms for adaptive weighting of voxels based on their classification difficulty, towards automated detection and segmentation of metastatic prostate cancer lesions in PET/CT scans. We obtained 380 PSMA [18-F] DCFPyL PET/CT scans of patients diagnosed with biochemical recurrence metastatic prostate cancer. We trained two 3D convolutional neural networks, Attention U-Net and SegResNet, and concatenated the PET and CT volumes channel-wise as input. The performance of our custom loss function was evaluated against the Dice and Dice Focal Loss functions. For clinical significance, we considered a detected region of interest (ROI) as a true positive if at least the voxel with the maximum standardized uptake value falls within the ROI. We assessed the models' performance based on the number of lesions in an image, tumour volume, activity, and extent of spread. The L1DFL outperformed the comparative loss functions by at least 13% on the test set. In addition, the F1 scores of the Dice Loss and the Dice Focal Loss were lower than that of L1DFL by at least 6% and 34%, respectively. The Dice Focal Loss yielded more false positives, whereas the Dice Loss was more sensitive to smaller volumes and struggled to segment larger lesions accurately. They also exhibited network-specific variations and yielded declines in segmentation accuracy with increased tumour spread. Our results demonstrate the potential of L1DFL to yield robust segmentation of metastatic prostate cancer lesions in PSMA PET/CT images. The results further highlight potential complexities arising from the variations in lesion characteristics that may influence automated prostate cancer tumour detection and segmentation. The code is publicly available at: https://github.com/ObedDzik/pca_segment.git.

7.1LGMay 21, 2025Code

AllMetrics: A Unified Python Library for Standardized Metric Evaluation and Robust Data Validation in Machine Learning

Morteza Alizadeh, Mehrdad Oveisi, Sonya Falahati et al.

Machine learning (ML) models rely heavily on consistent and accurate performance metrics to evaluate and compare their effectiveness. However, existing libraries often suffer from fragmentation, inconsistent implementations, and insufficient data validation protocols, leading to unreliable results. Existing libraries have often been developed independently and without adherence to a unified standard, particularly concerning the specific tasks they aim to support. As a result, each library tends to adopt its conventions for metric computation, input/output formatting, error handling, and data validation protocols. This lack of standardization leads to both implementation differences (ID) and reporting differences (RD), making it difficult to compare results across frameworks or ensure reliable evaluations. To address these issues, we introduce AllMetrics, an open-source unified Python library designed to standardize metric evaluation across diverse ML tasks, including regression, classification, clustering, segmentation, and image-to-image translation. The library implements class-specific reporting for multi-class tasks through configurable parameters to cover all use cases, while incorporating task-specific parameters to resolve metric computation discrepancies across implementations. Various datasets from domains like healthcare, finance, and real estate were applied to our library and compared with Python, Matlab, and R components to identify which yield similar results. AllMetrics combines a modular Application Programming Interface (API) with robust input validation mechanisms to ensure reproducibility and reliability in model evaluation. This paper presents the design principles, architectural components, and empirical analyses demonstrating the ability to mitigate evaluation errors and to enhance the trustworthiness of ML workflows.

8.6IVApr 22, 2025Code

Comprehensive Evaluation of Quantitative Measurements from Automated Deep Segmentations of PSMA PET/CT Images

Obed Korshie Dzikunu, Amirhossein Toosi, Shadab Ahamed et al.

This study performs a comprehensive evaluation of quantitative measurements as extracted from automated deep-learning-based segmentation methods, beyond traditional Dice Similarity Coefficient assessments, focusing on six quantitative metrics, namely SUVmax, SUVmean, total lesion activity (TLA), tumor volume (TMTV), lesion count, and lesion spread. We analyzed 380 prostate-specific membrane antigen (PSMA) targeted [18F]DCFPyL PET/CT scans of patients with biochemical recurrence of prostate cancer, training deep neural networks, U-Net, Attention U-Net and SegResNet with four loss functions: Dice Loss, Dice Cross Entropy, Dice Focal Loss, and our proposed L1 weighted Dice Focal Loss (L1DFL). Evaluations indicated that Attention U-Net paired with L1DFL achieved the strongest correlation with the ground truth (concordance correlation = 0.90-0.99 for SUVmax and TLA), whereas models employing the Dice Loss and the other two compound losses, particularly with SegResNet, underperformed. Equivalence testing (TOST, alpha = 0.05, Delta = 20%) confirmed high performance for SUV metrics, lesion count and TLA, with L1DFL yielding the best performance. By contrast, tumor volume and lesion spread exhibited greater variability. Bland-Altman, Coverage Probability, and Total Deviation Index analyses further highlighted that our proposed L1DFL minimizes variability in quantification of the ground truth clinical measures. The code is publicly available at: https://github.com/ObedDzik/pca\_segment.git.

3.6IVApr 30, 2024Code

IgCONDA-PET: Weakly-Supervised PET Anomaly Detection using Implicitly-Guided Attention-Conditional Counterfactual Diffusion Modeling -- a Multi-Center, Multi-Cancer, and Multi-Tracer Study

Shadab Ahamed, Arman Rahmim

Minimizing the need for pixel-level annotated data to train PET lesion detection and segmentation networks is highly desired and can be transformative, given time and cost constraints associated with expert annotations. Current unsupervised or weakly-supervised anomaly detection methods rely on autoencoder or generative adversarial networks (GANs) trained only on healthy data. While these approaches reduce annotation dependency, GAN-based methods are notably more challenging to train than non-GAN alternatives (such as autoencoders) due to issues such as the simultaneous optimization of two competing networks, mode collapse, and training instability. In this paper, we present the weakly-supervised $\textbf{I}$mplicitly-$\textbf{g}$uided $\textbf{CO}$u$\textbf{N}$terfactual diffusion model for $\textbf{D}$etecting $\textbf{A}$nomalies in $\textbf{PET}$ images (IgCONDA-PET). The solution is developed and validated using PET scans from six retrospective cohorts consisting of a total of 2652 cases (multi-cancer, multi-tracer) containing both local and public datasets (spanning multiple centers). The training is conditioned on image class labels (healthy vs. unhealthy) via attention modules, and we employ implicit diffusion guidance. We perform counterfactual generation which facilitates "unhealthy-to-healthy" domain translation by generating a synthetic, healthy version of an unhealthy input image, enabling the detection of anomalies through the calculated differences. The performance of our method was compared against several other deep learning based weakly-supervised or unsupervised methods as well as traditional methods like 41% SUV$_\text{max}$ thresholding. We also highlight the importance of incorporating attention modules in our network for the detection of small anomalies. The code is publicly available at: https://github.com/ahxmeds/IgCONDA-PET.git.

1.2MED-PHMay 2, 2024Code

Segmentation-Free Outcome Prediction from Head and Neck Cancer PET/CT Images: Deep Learning-Based Feature Extraction from Multi-Angle Maximum Intensity Projections (MA-MIPs)

Amirhosein Toosi, Isaac Shiri, Habib Zaidi et al.

We introduce an innovative, simple, effective segmentation-free approach for outcome prediction in head \& neck cancer (HNC) patients. By harnessing deep learning-based feature extraction techniques and multi-angle maximum intensity projections (MA-MIPs) applied to Fluorodeoxyglucose Positron Emission Tomography (FDG-PET) volumes, our proposed method eliminates the need for manual segmentations of regions-of-interest (ROIs) such as primary tumors and involved lymph nodes. Instead, a state-of-the-art object detection model is trained to perform automatic cropping of the head and neck region on the PET volumes. A pre-trained deep convolutional neural network backbone is then utilized to extract deep features from MA-MIPs obtained from 72 multi-angel axial rotations of the cropped PET volumes. These deep features extracted from multiple projection views of the PET volumes are then aggregated and fused, and employed to perform recurrence-free survival analysis on a cohort of 489 HNC patients. The proposed approach outperforms the best performing method on the target dataset for the task of recurrence-free survival analysis. By circumventing the manual delineation of the malignancies on the FDG PET-CT images, our approach eliminates the dependency on subjective interpretations and highly enhances the reproducibility of the proposed survival analysis method.

6.5CVNov 25, 2024

Enhanced Lung Cancer Survival Prediction using Semi-Supervised Pseudo-Labeling and Learning from Diverse PET/CT Datasets

Mohammad R. Salmanpour, Arman Gorji, Amin Mousavi et al.

Objective: This study explores a semi-supervised learning (SSL), pseudo-labeled strategy using diverse datasets to enhance lung cancer (LCa) survival predictions, analyzing Handcrafted and Deep Radiomic Features (HRF/DRF) from PET/CT scans with Hybrid Machine Learning Systems (HMLS). Methods: We collected 199 LCa patients with both PET & CT images, obtained from The Cancer Imaging Archive (TCIA) and our local database, alongside 408 head&neck cancer (HNCa) PET/CT images from TCIA. We extracted 215 HRFs and 1024 DRFs by PySERA and a 3D-Autoencoder, respectively, within the ViSERA software, from segmented primary tumors. The supervised strategy (SL) employed a HMLSs: PCA connected with 4 classifiers on both HRF and DRFs. SSL strategy expanded the datasets by adding 408 pseudo-labeled HNCa cases (labeled by Random Forest algorithm) to 199 LCa cases, using the same HMLSs techniques. Furthermore, Principal Component Analysis (PCA) linked with 4 survival prediction algorithms were utilized in survival hazard ratio analysis. Results: SSL strategy outperformed SL method (p-value<0.05), achieving an average accuracy of 0.85 with DRFs from PET and PCA+ Multi-Layer Perceptron (MLP), compared to 0.65 for SL strategy using DRFs from CT and PCA+ K-Nearest Neighbor (KNN). Additionally, PCA linked with Component-wise Gradient Boosting Survival Analysis on both HRFs and DRFs, as extracted from CT, had an average c-index of 0.80 with a Log Rank p-value<<0.001, confirmed by external testing. Conclusions: Shifting from HRFs and SL to DRFs and SSL strategies, particularly in contexts with limited data points, enabling CT or PET alone to significantly achieve high predictive performance.

8.5IVMar 11, 2024

A cascaded deep network for automated tumor detection and segmentation in clinical PET imaging of diffuse large B-cell lymphoma

Shadab Ahamed, Natalia Dubljevic, Ingrid Bloise et al.

Accurate detection and segmentation of diffuse large B-cell lymphoma (DLBCL) from PET images has important implications for estimation of total metabolic tumor volume, radiomics analysis, surgical intervention and radiotherapy. Manual segmentation of tumors in whole-body PET images is time-consuming, labor-intensive and operator-dependent. In this work, we develop and validate a fast and efficient three-step cascaded deep learning model for automated detection and segmentation of DLBCL tumors from PET images. As compared to a single end-to-end network for segmentation of tumors in whole-body PET images, our three-step model is more effective (improves 3D Dice score from 58.9% to 78.1%) since each of its specialized modules, namely the slice classifier, the tumor detector and the tumor segmentor, can be trained independently to a high degree of skill to carry out a specific task, rather than a single network with suboptimal performance on overall segmentation.

4.3MED-PHDec 14, 2024Code

Biological and Radiological Dictionary of Radiomics Features: Addressing Understandable AI Issues in Personalized Prostate Cancer; Dictionary Version PM1.0

Mohammad R. Salmanpour, Sajad Amiri, Sara Gharibi et al.

We investigate the connection between visual semantic features defined in PI-RADS and associated risk factors, moving beyond abnormal imaging findings, establishing a shared framework between medical and AI professionals by creating a standardized dictionary of biological/radiological RFs. Subsequently, 6 interpretable and seven complex classifiers, linked with nine interpretable feature selection algorithms (FSA) applied to risk factors, were extracted from segmented lesions in T2-weighted imaging (T2WI), diffusion-weighted imaging (DWI), and apparent diffusion coefficient (ADC) multiparametric-prostate MRI sequences to predict the UCLA scores. We then utilized the created dictionary to interpret the best-predictive models. Combining T2WI, DWI, and ADC with FSAs including ANOVA F-test, Correlation Coefficient, and Fisher Score, and utilizing logistic regression, identified key features: The 90th percentile from T2WI, which captures hypo-intensity related to prostate cancer risk; Variance from T2WI, indicating lesion heterogeneity; shape metrics including Least Axis Length and Surface Area to Volume ratio from ADC, describing lesion shape and compactness; and Run Entropy from ADC, reflecting texture consistency. This approach achieved the highest average accuracy of 0.78, significantly outperforming single-sequence methods (p-value<0.05). The developed dictionary for Prostate-MRI (PM1.0) serves as a common language, fosters collaboration between clinical professionals and AI developers to advance trustworthy AI solutions that support reliable/interpretable clinical decisions.

2.3MED-PHJul 10, 2025

Robust Semi-Supervised CT Radiomics for Lung Cancer Prognosis: Cost-Effective Learning with Limited Labels and SHAP Interpretation

Mohammad R. Salmanpour, Amir Hossein Pouria, Sonia Falahati et al.

Background: CT imaging is vital for lung cancer management, offering detailed visualization for AI-based prognosis. However, supervised learning SL models require large labeled datasets, limiting their real-world application in settings with scarce annotations. Methods: We analyzed CT scans from 977 patients across 12 datasets extracting 1218 radiomics features using Laplacian of Gaussian and wavelet filters via PyRadiomics Dimensionality reduction was applied with 56 feature selection and extraction algorithms and 27 classifiers were benchmarked A semi supervised learning SSL framework with pseudo labeling utilized 478 unlabeled and 499 labeled cases Model sensitivity was tested in three scenarios varying labeled data in SL increasing unlabeled data in SSL and scaling both from 10 percent to 100 percent SHAP analysis was used to interpret predictions Cross validation and external testing in two cohorts were performed. Results: SSL outperformed SL, improving overall survival prediction by up to 17 percent. The top SSL model, Random Forest plus XGBoost classifier, achieved 0.90 accuracy in cross-validation and 0.88 externally. SHAP analysis revealed enhanced feature discriminability in both SSL and SL, especially for Class 1 survival greater than 4 years. SSL showed strong performance with only 10 percent labeled data, with more stable results compared to SL and lower variance across external testing, highlighting SSL's robustness and cost effectiveness. Conclusion: We introduced a cost-effective, stable, and interpretable SSL framework for CT-based survival prediction in lung cancer, improving performance, generalizability, and clinical readiness by integrating SHAP explainability and leveraging unlabeled data.

2.6LGNov 18, 2024

Machine Learning Evaluation Metric Discrepancies across Programming Languages and Their Components: Need for Standardization

Mohammad R. Salmanpour, Morteza Alizadeh, Ghazal Mousavi et al.

This study evaluates metrics for tasks such as classification, regression, clustering, correlation analysis, statistical tests, segmentation, and image-to-image (I2I) translation. Metrics were compared across Python libraries, R packages, and Matlab functions to assess their consistency and highlight discrepancies. The findings underscore the need for a unified roadmap to standardize metrics, ensuring reliable and reproducible ML evaluations across platforms. This study examined a wide range of evaluation metrics across various tasks and found only some to be consistent across platforms, such as (i) Accuracy, Balanced Accuracy, Cohens Kappa, F-beta Score, MCC, Geometric Mean, AUC, and Log Loss in binary classification; (ii) Accuracy, Cohens Kappa, and F-beta Score in multi-class classification; (iii) MAE, MSE, RMSE, MAPE, Explained Variance, Median AE, MSLE, and Huber in regression; (iv) Davies-Bouldin Index and Calinski-Harabasz Index in clustering; (v) Pearson, Spearman, Kendall's Tau, Mutual Information, Distance Correlation, Percbend, Shepherd, and Partial Correlation in correlation analysis; (vi) Paired t-test, Chi-Square Test, ANOVA, Kruskal-Wallis Test, Shapiro-Wilk Test, Welchs t-test, and Bartlett's test in statistical tests; (vii) Accuracy, Precision, and Recall in 2D segmentation; (viii) Accuracy in 3D segmentation; (ix) MAE, MSE, RMSE, and R-Squared in 2D-I2I translation; and (x) MAE, MSE, and RMSE in 3D-I2I translation. Given observation of discrepancies in a number of metrics (e.g. precision, recall and F1 score in binary classification, WCSS in clustering, multiple statistical tests, and IoU in segmentation, amongst multiple metrics), this study concludes that ML evaluation metrics require standardization and recommends that future research use consistent metrics for different tasks to effectively compare ML techniques and solutions.

8.5IVMar 11, 2024

A slice classification neural network for automated classification of axial PET/CT slices from a multi-centric lymphoma dataset

Shadab Ahamed, Yixi Xu, Ingrid Bloise et al.

Automated slice classification is clinically relevant since it can be incorporated into medical image segmentation workflows as a preprocessing step that would flag slices with a higher probability of containing tumors, thereby directing physicians attention to the important slices. In this work, we train a ResNet-18 network to classify axial slices of lymphoma PET/CT images (collected from two institutions) depending on whether the slice intercepted a tumor (positive slice) in the 3D image or if the slice did not (negative slice). Various instances of the network were trained on 2D axial datasets created in different ways: (i) slice-level split and (ii) patient-level split; inputs of different types were used: (i) only PET slices and (ii) concatenated PET and CT slices; and different training strategies were employed: (i) center-aware (CAW) and (ii) center-agnostic (CAG). Model performances were compared using the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC), and various binary classification metrics. We observe and describe a performance overestimation in the case of slice-level split as compared to the patient-level split training. The model trained using patient-level split data with the network input containing only PET slices in the CAG training regime was the best performing/generalizing model on a majority of metrics. Our models were additionally more closely compared using the sensitivity metric on the positive slices from their respective test sets.

3.6CVOct 19, 2025

Click, Predict, Trust: Clinician-in-the-Loop AI Segmentation for Lung Cancer CT-Based Prognosis within the Knowledge-to-Action Framework

Mohammad R. Salmanpour, Sonya Falahati, Amir Hossein Pouria et al.

Lung cancer remains the leading cause of cancer mortality, with CT imaging central to screening, prognosis, and treatment. Manual segmentation is variable and time-intensive, while deep learning (DL) offers automation but faces barriers to clinical adoption. Guided by the Knowledge-to-Action framework, this study develops a clinician-in-the-loop DL pipeline to enhance reproducibility, prognostic accuracy, and clinical trust. Multi-center CT data from 999 patients across 12 public datasets were analyzed using five DL models (3D Attention U-Net, ResUNet, VNet, ReconNet, SAM-Med3D), benchmarked against expert contours on whole and click-point cropped images. Segmentation reproducibility was assessed using 497 PySERA-extracted radiomic features via Spearman correlation, ICC, Wilcoxon tests, and MANOVA, while prognostic modeling compared supervised (SL) and semi-supervised learning (SSL) across 38 dimensionality reduction strategies and 24 classifiers. Six physicians qualitatively evaluated masks across seven domains, including clinical meaningfulness, boundary quality, prognostic value, trust, and workflow integration. VNet achieved the best performance (Dice = 0.83, IoU = 0.71), radiomic stability (mean correlation = 0.76, ICC = 0.65), and predictive accuracy under SSL (accuracy = 0.88, F1 = 0.83). SSL consistently outperformed SL across models. Radiologists favored VNet for peritumoral representation and smoother boundaries, preferring AI-generated initial masks for refinement rather than replacement. These results demonstrate that integrating VNet with SSL yields accurate, reproducible, and clinically trusted CT-based lung cancer prognosis, highlighting a feasible path toward physician-centered AI translation.

1.2MED-PHOct 14, 2025

What is Implementation Science; and Why It Matters for Bridging the Artificial Intelligence Innovation-to-Application Gap in Medical Imaging

Ahmad Fayaz-Bakhsh, Janice Tania, Syaheerah Lebai Lutfi et al.

The transformative potential of artificial intelligence (AI) in medical Imaging (MI) is well recognized. Yet despite promising reports in research settings, many AI tools fail to achieve clinical adoption in practice. In fact, more generally, there is a documented 17-year average delay between evidence generation and implementation of a technology1. Implementation science (IS) may provide a practical, evidence-based framework to bridge the gap between AI development and real-world clinical imaging use that helps shorten this lag through systematic frameworks, strategies, and hybrid research designs. We outline challenges specific to AI adoption in MI workflows, including infrastructural, educational, and cultural barriers. We highlight the complementary roles of effectiveness research and implementation research, emphasizing hybrid study designs and the role of integrated KT (iKT), stakeholder engagement, and equity-focused co-creation in designing sustainable and generalizable solutions. We discuss integration of Human-Computer Interaction (HCI) frameworks in MI towards usable AI. Adopting IS is not only a methodological advancement; it is a strategic imperative for accelerating translation of innovation into improved patient outcomes.

1.2MED-PHOct 14, 2025

Artificial intelligence for simplified patient-centered dosimetry in radiopharmaceutical therapies

Alejandro Lopez-Montes, Fereshteh Yousefirizi, Yizhou Chen et al.

KEY WORDS: Artificial Intelligence (AI), Theranostics, Dosimetry, Radiopharmaceutical Therapy (RPT), Patient-friendly dosimetry KEY POINTS - The rapid evolution of radiopharmaceutical therapy (RPT) highlights the growing need for personalized and patient-centered dosimetry. - Artificial Intelligence (AI) offers solutions to the key limitations in current dosimetry calculations. - The main advances on AI for simplified dosimetry toward patient-friendly RPT are reviewed. - Future directions on the role of AI in RPT dosimetry are discussed.

3.6CVSep 13, 2025

Enhancement Without Contrast: Stability-Aware Multicenter Machine Learning for Glioma MRI Imaging

Sajad Amiri, Shahram Taeb, Sara Gharibi et al.

Gadolinium-based contrast agents (GBCAs) are central to glioma imaging but raise safety, cost, and accessibility concerns. Predicting contrast enhancement from non-contrast MRI using machine learning (ML) offers a safer alternative, as enhancement reflects tumor aggressiveness and informs treatment planning. Yet scanner and cohort variability hinder robust model selection. We propose a stability-aware framework to identify reproducible ML pipelines for multicenter prediction of glioma MRI contrast enhancement. We analyzed 1,446 glioma cases from four TCIA datasets (UCSF-PDGM, UPENN-GB, BRATS-Africa, BRATS-TCGA-LGG). Non-contrast T1WI served as input, with enhancement derived from paired post-contrast T1WI. Using PyRadiomics under IBSI standards, 108 features were extracted and combined with 48 dimensionality reduction methods and 25 classifiers, yielding 1,200 pipelines. Rotational validation was trained on three datasets and tested on the fourth. Cross-validation prediction accuracies ranged from 0.91 to 0.96, with external testing achieving 0.87 (UCSF-PDGM), 0.98 (UPENN-GB), and 0.95 (BRATS-Africa), with an average of 0.93. F1, precision, and recall were stable (0.87 to 0.96), while ROC-AUC varied more widely (0.50 to 0.82), reflecting cohort heterogeneity. The MI linked with ETr pipeline consistently ranked highest, balancing accuracy and stability. This framework demonstrates that stability-aware model selection enables reliable prediction of contrast enhancement from non-contrast glioma MRI, reducing reliance on GBCAs and improving generalizability across centers. It provides a scalable template for reproducible ML in neuro-oncology and beyond.

3.6CVAug 30, 2025

A Multimodal and Multi-centric Head and Neck Cancer Dataset for Segmentation, Diagnosis and Outcome Prediction

Numan Saeed, Salma Hassan, Shahad Hardan et al.

We present a publicly available multimodal dataset for head and neck cancer research, comprising 1123 annotated Positron Emission Tomography/Computed Tomography (PET/CT) studies from patients with histologically confirmed disease, acquired from 10 international medical centers. All studies contain co-registered PET/CT scans with varying acquisition protocols, reflecting real-world clinical diversity from a long-term, multi-institution retrospective collection. Primary gross tumor volumes (GTVp) and involved lymph nodes (GTVn) were manually segmented by experienced radiation oncologists and radiologists following established guidelines. We provide anonymized NifTi files, expert-annotated segmentation masks, comprehensive clinical metadata, and radiotherapy dose distributions for a patient subset. The metadata include TNM staging, HPV status, demographics, long-term follow-up outcomes, survival times, censoring indicators, and treatment information. To demonstrate its utility, we benchmark three key clinical tasks: automated tumor segmentation, recurrence-free survival prediction, and HPV status classification, using state-of-the-art deep learning models like UNet, SegResNet, and multimodal prognostic frameworks.

1.2COMP-PHJul 21, 2025

Radiological and Biological Dictionary of Radiomics Features: Addressing Understandable AI Issues in Personalized Breast Cancer; Dictionary Version BM1.0

Arman Gorji, Nima Sanati, Amir Hossein Pouria et al.

Radiomics-based AI models show promise for breast cancer diagnosis but often lack interpretability, limiting clinical adoption. This study addresses the gap between radiomic features (RF) and the standardized BI-RADS lexicon by proposing a dual-dictionary framework. First, a Clinically-Informed Feature Interpretation Dictionary (CIFID) was created by mapping 56 RFs to BI-RADS descriptors (shape, margin, internal enhancement) through literature and expert review. The framework was applied to classify triple-negative breast cancer (TNBC) versus non-TNBC using dynamic contrast-enhanced MRI from a multi-institutional cohort of 1,549 patients. We trained 27 machine learning classifiers with 27 feature selection methods. SHapley Additive exPlanations (SHAP) were used to interpret predictions and generate a complementary Data-Driven Feature Interpretation Dictionary (DDFID) for 52 additional RFs. The best model, combining Variance Inflation Factor (VIF) selection with Extra Trees Classifier, achieved an average cross-validation accuracy of 0.83. Key predictive RFs aligned with clinical knowledge: higher Sphericity (round/oval shape) and lower Busyness (more homogeneous enhancement) were associated with TNBC. The framework confirmed known imaging biomarkers and uncovered novel, interpretable associations. This dual-dictionary approach (BM1.0) enhances AI model transparency and supports the integration of RFs into routine breast cancer diagnosis and personalized care.

1.2COMP-PHMay 20, 2025

Pathobiological Dictionary Defining Pathomics and Texture Features: Addressing Understandable AI Issues in Personalized Liver Cancer; Dictionary Version LCP1.0

Mohammad R. Salmanpour, Seyed Mohammad Piri, Somayeh Sadat Mehrnia et al.

Artificial intelligence (AI) holds strong potential for medical diagnostics, yet its clinical adoption is limited by a lack of interpretability and generalizability. This study introduces the Pathobiological Dictionary for Liver Cancer (LCP1.0), a practical framework designed to translate complex Pathomics and Radiomics Features (PF and RF) into clinically meaningful insights aligned with existing diagnostic workflows. QuPath and PyRadiomics, standardized according to IBSI guidelines, were used to extract 333 imaging features from hepatocellular carcinoma (HCC) tissue samples, including 240 PF-based-cell detection/intensity, 74 RF-based texture, and 19 RF-based first-order features. Expert-defined ROIs from the public dataset excluded artifact-prone areas, and features were aggregated at the case level. Their relevance to the WHO grading system was assessed using multiple classifiers linked with feature selectors. The resulting dictionary was validated by 8 experts in oncology and pathology. In collaboration with 10 domain experts, we developed a Pathobiological dictionary of imaging features such as PFs and RF. In our study, the Variable Threshold feature selection algorithm combined with the SVM model achieved the highest accuracy (0.80, P-value less than 0.05), selecting 20 key features, primarily clinical and pathomics traits such as Centroid, Cell Nucleus, and Cytoplasmic characteristics. These features, particularly nuclear and cytoplasmic, were strongly associated with tumor grading and prognosis, reflecting atypia indicators like pleomorphism, hyperchromasia, and cellular orientation.The LCP1.0 provides a clinically validated bridge between AI outputs and expert interpretation, enhancing model transparency and usability. Aligning AI-derived features with clinical semantics supports the development of interpretable, trustworthy diagnostic tools for liver cancer pathology.

1.2MED-PHMar 1, 2025

AI-Augmented Thyroid Scintigraphy for Robust Classification

Maziar Sabouri, Ghasem Hajianfar, Alireza Rafiei Sardouei et al.

Purpose: Thyroid scintigraphy plays a vital role in diagnosing a range of thyroid disorders. While deep learning classification models hold significant promise in this domain, their effectiveness is frequently compromised by limited and imbalanced datasets. This study investigates the impact of three data augmentation strategies including Stable Diffusion (SD), Flow Matching (FM), and Conventional Augmentation (CA), on enhancing the performance of a ResNet18 classifier. Methods: Anterior thyroid scintigraphy images from 2,954 patients across nine medical centers were classified into four categories: Diffuse Goiter (DG), Nodular Goiter (NG), Normal (NL), and Thyroiditis (TI). Data augmentation was performed using various SD and FM models, resulting in 18 distinct augmentation scenarios. Each augmented dataset was used to train a ResNet18 classifier. Model performance was assessed using class-wise and average precision, recall, F1-score, AUC, and image fidelity metrics (FID and KID). Results: FM-based augmentation outperformed all other methods, achieving the highest classification accuracy and lowest FID/KID scores, indicating both improved model generalization and realistic image synthesis. SD1, combining image and prompt inputs in the inference process, was the most effective SD variant, suggesting that physician-generated prompts provide meaningful clinical context. O+FM+CA yielded the most balanced and robust performance across all classes. Conclusion: Integrating FM and clinically-informed SD augmentation, especially when guided by expert prompts, substantially improves thyroid scintigraphy classification. These findings highlight the importance of leveraging both structured medical input and advanced generative models for more effective training on limited datasets.

9.2MLJun 20, 2024

Deep Optimal Experimental Design for Parameter Estimation Problems

Md Shahriar Rahim Siddiqui, Arman Rahmim, Eldad Haber

Optimal experimental design is a well studied field in applied science and engineering. Techniques for estimating such a design are commonly used within the framework of parameter estimation. Nonetheless, in recent years parameter estimation techniques are changing rapidly with the introduction of deep learning techniques to replace traditional estimation methods. This in turn requires the adaptation of optimal experimental design that is associated with these new techniques. In this paper we investigate a new experimental design methodology that uses deep learning. We show that the training of a network as a Likelihood Free Estimator can be used to significantly simplify the design process and circumvent the need for the computationally expensive bi-level optimization problem that is inherent in optimal experimental design for non-linear systems. Furthermore, deep design improves the quality of the recovery process for parameter estimation problems. As proof of concept we apply our methodology to two different systems of Ordinary Differential Equations.

1.2MED-PHJun 3, 2024

Nuclear Medicine AI in Action: The Bethesda Report (AI Summit 2024)

Arman Rahmim, Tyler J. Bradshaw, Guido Davidzon et al.

The 2nd SNMMI Artificial Intelligence (AI) Summit, organized by the SNMMI AI Task Force, took place in Bethesda, MD, on February 29 - March 1, 2024. Bringing together various community members and stakeholders, and following up on a prior successful 2022 AI Summit, the summit theme was: AI in Action. Six key topics included (i) an overview of prior and ongoing efforts by the AI task force, (ii) emerging needs and tools for computational nuclear oncology, (iii) new frontiers in large language and generative models, (iv) defining the value proposition for the use of AI in nuclear medicine, (v) open science including efforts for data and model repositories, and (vi) issues of reimbursement and funding. The primary efforts, findings, challenges, and next steps are summarized in this manuscript.

4.3MED-PHFeb 16, 2022

Segmentation and Risk Score Prediction of Head and Neck Cancers in PET/CT Volumes with 3D U-Net and Cox Proportional Hazard Neural Networks

Fereshteh Yousefirizi, Ian Janzen, Natalia Dubljevic et al.

We utilized a 3D nnU-Net model with residual layers supplemented by squeeze and excitation (SE) normalization for tumor segmentation from PET/CT images provided by the Head and Neck Tumor segmentation chal-lenge (HECKTOR). Our proposed loss function incorporates the Unified Fo-cal and Mumford-Shah losses to take the advantage of distribution, region, and boundary-based loss functions. The results of leave-one-out-center-cross-validation performed on different centers showed a segmentation performance of 0.82 average Dice score (DSC) and 3.16 median Hausdorff Distance (HD), and our results on the test set achieved 0.77 DSC and 3.01 HD. Following lesion segmentation, we proposed training a case-control proportional hazard Cox model with an MLP neural net backbone to predict the hazard risk score for each discrete lesion. This hazard risk prediction model (CoxCC) was to be trained on a number of PET/CT radiomic features extracted from the segmented lesions, patient and lesion demographics, and encoder features provided from the penultimate layer of a multi-input 2D PET/CT convolutional neural network tasked with predicting time-to-event for each lesion. A 10-fold cross-validated CoxCC model resulted in a c-index validation score of 0.89, and a c-index score of 0.61 on the HECKTOR challenge test dataset.

4.4IVNov 2, 2021

3-D PET Image Generation with tumour masks using TGAN

Robert V Bergen, Jean-Francois Rajotte, Fereshteh Yousefirizi et al.

Training computer-vision related algorithms on medical images for disease diagnosis or image segmentation is difficult due to the lack of training data, labeled samples, and privacy concerns. For this reason, a robust generative method to create synthetic data is highly sought after. However, most three-dimensional image generators require additional image input or are extremely memory intensive. To address these issues we propose adapting video generation techniques for 3-D image generation. Using the temporal GAN (TGAN) architecture, we show we are able to generate realistic head and neck PET images. We also show that by conditioning the generator on tumour masks, we are able to control the geometry and location of the tumour in the generated images. To test the utility of the synthetic images, we train a segmentation model using the synthetic images. Synthetic images conditioned on real tumour masks are automatically segmented, and the corresponding real images are also segmented. We evaluate the segmentations using the Dice score and find the segmentation algorithm performs similarly on both datasets (0.65 synthetic data, 0.70 real data). Various radionomic features are then calculated over the segmented tumour volumes for each data set. A comparison of the real and synthetic feature distributions show that seven of eight feature distributions had statistically insignificant differences (p>0.05). Correlation coefficients were also calculated between all radionomic features and it is shown that all of the strong statistical correlations in the real data set are preserved in the synthetic data set.

3.3MED-PHOct 20, 2021

AI-Based Detection, Classification and Prediction/Prognosis in Medical Imaging: Towards Radiophenomics

Fereshteh Yousefirizi, Pierre Decazes, Amine Amyar et al.

Artificial intelligence (AI) techniques have significant potential to enable effective, robust and automated image phenotyping including identification of subtle patterns. AI-based detection searches the image space to find the regions of interest based on patterns and features. There is a spectrum of tumor histologies from benign to malignant that can be identified by AI-based classification approaches using image features. The extraction of minable information from images gives way to the field of radiomics and can be explored via explicit (handcrafted/engineered) and deep radiomics frameworks. Radiomics analysis has the potential to be utilized as a noninvasive technique for the accurate characterization of tumors to improve diagnosis and treatment monitoring. This work reviews AI-based techniques, with a special focus on oncological PET and PET/CT imaging, for different detection, classification, and prediction/prognosis tasks. We also discuss needed efforts to enable the translation of AI techniques to routine clinical workflows, and potential improvements and complementary techniques such as the use of natural language processing on electronic health records and neuro-symbolic AI techniques.

10.1AISep 3, 2021

A brief history of AI: how to prevent another winter (a critical review)

Amirhosein Toosi, Andrea Bottino, Babak Saboury et al.

The field of artificial intelligence (AI), regarded as one of the most enigmatic areas of science, has witnessed exponential growth in the past decade including a remarkably wide array of applications, having already impacted our everyday lives. Advances in computing power and the design of sophisticated AI algorithms have enabled computers to outperform humans in a variety of tasks, especially in the areas of computer vision and speech recognition. Yet, AI's path has never been smooth, having essentially fallen apart twice in its lifetime ('winters' of AI), both after periods of popular success ('summers' of AI). We provide a brief rundown of AI's evolution over the course of decades, highlighting its crucial moments and major turning points from inception to the present. In doing so, we attempt to learn, anticipate the future, and discuss what steps may be taken to prevent another 'winter'.

1.4CVJul 14, 2021

Artificial Intelligence in PET: an Industry Perspective

Arkadiusz Sitek, Sangtae Ahn, Evren Asma et al.

Artificial intelligence (AI) has significant potential to positively impact and advance medical imaging, including positron emission tomography (PET) imaging applications. AI has the ability to enhance and optimize all aspects of the PET imaging chain from patient scheduling, patient setup, protocoling, data acquisition, detector signal processing, reconstruction, image processing and interpretation. AI poses industry-specific challenges which will need to be addressed and overcome to maximize the future potentials of AI in PET. This paper provides an overview of these industry-specific challenges for the development, standardization, commercialization, and clinical adoption of AI, and explores the potential enhancements to PET imaging brought on by AI in the near future. In particular, the combination of on-demand image reconstruction, AI, and custom designed data processing workflows may open new possibilities for innovation which would positively impact the industry and ultimately patients.

3.3MED-PHJul 3, 2019

Next Generation Radiogenomics Sequencing for Prediction of EGFR and KRAS Mutation Status in NSCLC Patients Using Multimodal Imaging and Machine Learning Approaches

Isaac Shiri, Hassan Maleki, Ghasem Hajianfar et al.

Aim: In the present work, we aimed to evaluate a comprehensive radiomics framework that enabled prediction of EGFR and KRAS mutation status in NSCLC cancer patients based on PET and CT multi-modalities radiomic features and machine learning (ML) algorithms. Methods: Our study involved 211 NSCLC cancer patient with PET and CTD images. More than twenty thousand radiomic features from different image-feature sets were extracted Feature value was normalized to obtain Z-scores, followed by student t-test students for comparison, high correlated features were eliminated and the False discovery rate (FDR) correction were performed Six feature selection methods and twelve classifiers were used to predict gene status in patient and model evaluation was reported on independent validation sets (68 patients). Results: The best predictive power of conventional PET parameters was achieved by SUVpeak (AUC: 0.69, P-value = 0.0002) and MTV (AUC: 0.55, P-value = 0.0011) for EGFR and KRAS, respectively. Univariate analysis of radiomics features improved prediction power up to AUC: 75 (q-value: 0.003, Short Run Emphasis feature of GLRLM from LOG preprocessed image of PET with sigma value 1.5) and AUC: 0.71 (q-value 0.00005, The Large Dependence Low Gray Level Emphasis from GLDM in LOG preprocessed image of CTD sigma value 5) for EGFR and KRAS, respectively. Furthermore, the machine learning algorithm improved the perdition power up to AUC: 0.82 for EGFR (LOG preprocessed of PET image set with sigma 3 with VT feature selector and SGD classifier) and AUC: 0.83 for KRAS (CT image set with sigma 3.5 with SM feature selector and SGD classifier). Conclusion: We demonstrated that radiomic features extracted from different image-feature sets could be used for EGFR and KRAS mutation status prediction in NSCLC patients, and showed that they have more predictive power than conventional imaging parameters.