Khashayar Namdar

CV
h-index78
17papers
284citations
Novelty30%
AI Score36

17 Papers

QMJul 29, 2022Code
Open-radiomics: A Collection of Standardized Datasets and a Technical Protocol for Reproducible Radiomics Machine Learning Pipelines

Khashayar Namdar, Matthias W. Wagner, Birgit B. Ertl-Wagner et al. · utoronto

Background: As an important branch of machine learning pipelines in medical imaging, radiomics faces two major challenges namely reproducibility and accessibility. In this work, we introduce open-radiomics, a set of radiomics datasets along with a comprehensive radiomics pipeline based on our proposed technical protocol to improve the reproducibility of the results. Methods: We curated large-scale radiomics datasets based on three open-source datasets; BraTS 2020 for high-grade glioma (HGG) versus low-grade glioma (LGG) classification and survival analysis, BraTS 2023 for O6-methylguanine-DNA methyltransferase classification, and non-small cell lung cancer survival analysis from the Cancer Imaging Archive. Using BraTS 2020 Magnetic Resonance Imaging (MRI) dataset, we applied our protocol to 369 brain tumor patients (76 LGG, 293 HGG). Leveraging PyRadiomics for LGG vs. HGG classification, we generated 288 datasets from 4 MRI sequences, 3 binWidths, 6 normalization methods, and 4 tumor subregions. Random Forest classifiers were trained and validated (60%,20%,20%) across 100 different data splits (28,800 test results), evaluating Area Under the Receiver Operating Characteristic Curve (AUROC). Results: Unlike binWidth and image normalization, tumor subregion and imaging sequence significantly affected performance of the models. T1 contrast-enhanced sequence and the union of Necrotic and the non-enhancing tumor core subregions resulted in the highest AUROCs (average test AUROC 0.951, 95% confidence interval of (0.949, 0.952)). Although several settings and data splits (28 out of 28800) yielded test AUROC of 1, they were irreproducible. Conclusion: Our experiments demonstrate the sources of variability in radiomics pipelines (e.g., tumor subregion) can have a significant impact on the results, which may lead to superficial perfect performances that are irreproducible.

CVAug 22, 2022Code
Minimizing the Effect of Noise and Limited Dataset Size in Image Classification Using Depth Estimation as an Auxiliary Task with Deep Multitask Learning

Khashayar Namdar, Partoo Vafaeikia, Farzad Khalvati · utoronto

Generalizability is the ultimate goal of Machine Learning (ML) image classifiers, for which noise and limited dataset size are among the major concerns. We tackle these challenges through utilizing the framework of deep Multitask Learning (dMTL) and incorporating image depth estimation as an auxiliary task. On a customized and depth-augmented derivation of the MNIST dataset, we show a) multitask loss functions are the most effective approach of implementing dMTL, b) limited dataset size primarily contributes to classification inaccuracy, and c) depth estimation is mostly impacted by noise. In order to further validate the results, we manually labeled the NYU Depth V2 dataset for scene classification tasks. As a contribution to the field, we have made the data in python native format publicly available as an open-source dataset and provided the scene labels. Our experiments on MNIST and NYU-Depth-V2 show dMTL improves generalizability of the classifiers when the dataset is noisy and the number of examples is limited.

CVOct 13, 2022
Improving Deep Learning Models for Pediatric Low-Grade Glioma Tumors Molecular Subtype Identification Using 3D Probability Distributions of Tumor Location

Khashayar Namdar, Matthias W. Wagner, Kareem Kudus et al. · utoronto

Background and Purpose: Pediatric low-grade glioma (pLGG) is the most common type of brain tumor in children, and identification of molecular markers for pLGG is crucial for successful treatment planning. Convolutional Neural Network (CNN) models for pLGG subtype identification rely on tumor segmentation. We hypothesize tumor segmentations are suboptimal and thus, we propose to augment the CNN models using tumor location probability in MRI data. Materials and Methods: Our REB-approved retrospective study included MRI Fluid-Attenuated Inversion Recovery (FLAIR) sequences of 143 BRAF fused and 71 BRAF V600E mutated tumors. Tumor segmentations (regions of interest (ROIs)) were provided by a pediatric neuroradiology fellow and verified by a senior pediatric neuroradiologist. In each experiment, we randomly split the data into development and test with an 80/20 ratio. We combined the 3D binary ROI masks for each class in the development dataset to derive the probability density functions (PDF) of tumor location, and developed three pipelines: location-based, CNN-based, and hybrid. Results: We repeated the experiment with different model initializations and data splits 100 times and calculated the Area Under Receiver Operating Characteristic Curve (AUC). The location-based classifier achieved an AUC of 77.90, 95% confidence interval (CI) (76.76, 79.03). CNN-based classifiers achieved AUC of 86.11, CI (84.96, 87.25), while the tumor-location-guided CNNs outperformed the formers with an average AUC of 88.64 CI (87.57, 89.72), which was statistically significant (Student's t-test p-value 0.0018). Conclusion: We achieved statistically significant improvements by incorporating tumor location into the CNN models. Our results suggest that manually segmented ROIs may not be optimal.

IVNov 25, 2022
Automating Cobb Angle Measurement for Adolescent Idiopathic Scoliosis using Instance Segmentation

Chaojun Chen, Khashayar Namdar, Yujie Wu et al. · utoronto

Scoliosis is a three-dimensional deformity of the spine, most often diagnosed in childhood. It affects 2-3% of the population, which is approximately seven million people in North America. Currently, the reference standard for assessing scoliosis is based on the manual assignment of Cobb angles at the site of the curvature center. This manual process is time consuming and unreliable as it is affected by inter- and intra-observer variance. To overcome these inaccuracies, machine learning (ML) methods can be used to automate the Cobb angle measurement process. This paper proposes to address the Cobb angle measurement task using YOLACT, an instance segmentation model. The proposed method first segments the vertebrae in an X-Ray image using YOLACT, then it tracks the important landmarks using the minimum bounding box approach. Lastly, the extracted landmarks are used to calculate the corresponding Cobb angles. The model achieved a Symmetric Mean Absolute Percentage Error (SMAPE) score of 10.76%, demonstrating the reliability of this process in both vertebra localization and Cobb angle measurement.

IVNov 10, 2022
Generative Adversarial Networks for Weakly Supervised Generation and Evaluation of Brain Tumor Segmentations on MR Images

Jay J. Yoo, Khashayar Namdar, Matthias W. Wagner et al. · utoronto

Segmentation of regions of interest (ROIs) for identifying abnormalities is a leading problem in medical imaging. Using machine learning for this problem generally requires manually annotated ground-truth segmentations, demanding extensive time and resources from radiologists. This work presents a weakly supervised approach that utilizes binary image-level labels, which are much simpler to acquire, to effectively segment anomalies in 2D magnetic resonance images without ground truth annotations. We train a generative adversarial network (GAN) that converts cancerous images to healthy variants, which are used along with localization seeds as priors to generate improved weakly supervised segmentations. The non-cancerous variants can also be used to evaluate the segmentations in a weakly supervised fashion, which allows for the most effective segmentations to be identified and then applied to downstream clinical classification tasks. On the Multimodal Brain Tumor Segmentation (BraTS) 2020 dataset, our proposed method generates and identifies segmentations that achieve test Dice coefficients of 83.91%. Using these segmentations for pathology classification results with a test AUC of 93.32% which is comparable to the test AUC of 95.80% achieved when using true segmentations.

CVNov 25, 2022
Non-invasive Liver Fibrosis Screening on CT Images using Radiomics

Jay J. Yoo, Khashayar Namdar, Sean Carey et al. · utoronto

Objectives: To develop and evaluate a radiomics machine learning model for detecting liver fibrosis on CT of the liver. Methods: For this retrospective, single-centre study, radiomic features were extracted from Regions of Interest (ROIs) on CT images of patients who underwent simultaneous liver biopsy and CT examinations. Combinations of contrast, normalization, machine learning model, and feature selection method were determined based on their mean test Area Under the Receiver Operating Characteristic curve (AUC) on randomly placed ROIs. The combination and selected features with the highest AUC were used to develop a final liver fibrosis screening model. Results: The study included 101 male and 68 female patients (mean age = 51.2 years $\pm$ 14.7 [SD]). When averaging the AUC across all combinations, non-contrast enhanced (NC) CT (AUC, 0.6100; 95% CI: 0.5897, 0.6303) outperformed contrast-enhanced CT (AUC, 0.5680; 95% CI: 0.5471, 0.5890). The combination of hyperparameters and features that yielded the highest AUC was a logistic regression model with inputs features of maximum, energy, kurtosis, skewness, and small area high gray level emphasis extracted from non-contrast enhanced NC CT normalized using Gamma correction with $γ$ = 1.5 (AUC, 0.7833; 95% CI: 0.7821, 0.7845), (sensitivity, 0.9091; 95% CI: 0.9091, 0.9091). Conclusions: Radiomics-based machine learning models allow for the detection of liver fibrosis with reasonable accuracy and high sensitivity on NC CT. Thus, these models can be used to non-invasively screen for liver fibrosis, contributing to earlier detection of the disease at a potentially curable stage.

CVSep 20, 2022
Deep Superpixel Generation and Clustering for Weakly Supervised Segmentation of Brain Tumors in MR Images

Jay J. Yoo, Khashayar Namdar, Farzad Khalvati · utoronto

Training machine learning models to segment tumors and other anomalies in medical images is an important step for developing diagnostic tools but generally requires manually annotated ground truth segmentations, which necessitates significant time and resources. This work proposes the use of a superpixel generation model and a superpixel clustering model to enable weakly supervised brain tumor segmentations. The proposed method utilizes binary image-level classification labels, which are readily accessible, to significantly improve the initial region of interest segmentations generated by standard weakly supervised methods without requiring ground truth annotations. We used 2D slices of magnetic resonance brain scans from the Multimodal Brain Tumor Segmentation Challenge 2020 dataset and labels indicating the presence of tumors to train the pipeline. On the test cohort, our method achieved a mean Dice coefficient of 0.691 and a mean 95% Hausdorff distance of 18.1, outperforming existing superpixel-based weakly supervised segmentation methods.

IVFeb 5, 2024
Improving Pediatric Low-Grade Neuroepithelial Tumors Molecular Subtype Identification Using a Novel AUROC Loss Function for Convolutional Neural Networks

Khashayar Namdar, Matthias W. Wagner, Cynthia Hawkins et al.

Pediatric Low-Grade Neuroepithelial Tumors (PLGNT) are the most common pediatric cancer type, accounting for 40% of brain tumors in children, and identifying PLGNT molecular subtype is crucial for treatment planning. However, the gold standard to determine the PLGNT subtype is biopsy, which can be impractical or dangerous for patients. This research improves the performance of Convolutional Neural Networks (CNNs) in classifying PLGNT subtypes through MRI scans by introducing a loss function that specifically improves the model's Area Under the Receiver Operating Characteristic (ROC) Curve (AUROC), offering a non-invasive diagnostic alternative. In this study, a retrospective dataset of 339 children with PLGNT (143 BRAF fusion, 71 with BRAF V600E mutation, and 125 non-BRAF) was curated. We employed a CNN model with Monte Carlo random data splitting. The baseline model was trained using binary cross entropy (BCE), and achieved an AUROC of 86.11% for differentiating BRAF fusion and BRAF V600E mutations, which was improved to 87.71% using our proposed AUROC loss function (p-value 0.045). With multiclass classification, the AUROC improved from 74.42% to 76. 59% (p-value 0.0016).

CLMay 1, 2025
Red Teaming Large Language Models for Healthcare

Vahid Balazadeh, Michael Cooper, David Pellow et al. · utoronto

We present the design process and findings of the pre-conference workshop at the Machine Learning for Healthcare Conference (2024) entitled Red Teaming Large Language Models for Healthcare, which took place on August 15, 2024. Conference participants, comprising a mix of computational and clinical expertise, attempted to discover vulnerabilities -- realistic clinical prompts for which a large language model (LLM) outputs a response that could cause clinical harm. Red-teaming with clinicians enables the identification of LLM vulnerabilities that may not be recognised by LLM developers lacking clinical expertise. We report the vulnerabilities found, categorise them, and present the results of a replication study assessing the vulnerabilities across all LLMs provided.

CLMay 10, 2024
Opportunities for Persian Digital Humanities Research with Artificial Intelligence Language Models; Case Study: Forough Farrokhzad

Arash Rasti Meymandi, Zahra Hosseini, Sina Davari et al. · utoronto

This study explores the integration of advanced Natural Language Processing (NLP) and Artificial Intelligence (AI) techniques to analyze and interpret Persian literature, focusing on the poetry of Forough Farrokhzad. Utilizing computational methods, we aim to unveil thematic, stylistic, and linguistic patterns in Persian poetry. Specifically, the study employs AI models including transformer-based language models for clustering of the poems in an unsupervised framework. This research underscores the potential of AI in enhancing our understanding of Persian literary heritage, with Forough Farrokhzad's work providing a comprehensive case study. This approach not only contributes to the field of Persian Digital Humanities but also sets a precedent for future research in Persian literary studies using computational techniques.

AISep 11, 2025
Anti-Money Laundering Machine Learning Pipelines; A Technical Analysis on Identifying High-risk Bank Clients with Supervised Learning

Khashayar Namdar, Pin-Chien Wang, Tushar Raju et al. · utoronto

Anti-money laundering (AML) actions and measurements are among the priorities of financial institutions, for which machine learning (ML) has shown to have a high potential. In this paper, we propose a comprehensive and systematic approach for developing ML pipelines to identify high-risk bank clients in a dataset curated for Task 1 of the University of Toronto 2023-2024 Institute for Management and Innovation (IMI) Big Data and Artificial Intelligence Competition. The dataset included 195,789 customer IDs, and we employed a 16-step design and statistical analysis to ensure the final pipeline was robust. We also framed the data in a SQLite database, developed SQL-based feature engineering algorithms, connected our pre-trained model to the database, and made it inference-ready, and provided explainable artificial intelligence (XAI) modules to derive feature importance. Our pipeline achieved a mean area under the receiver operating characteristic curve (AUROC) of 0.961 with a standard deviation (SD) of 0.005. The proposed pipeline achieved second place in the competition.

CVJun 25, 2025
AI-Driven MRI-based Brain Tumour Segmentation Benchmarking

Connor Ludwig, Khashayar Namdar, Farzad Khalvati · utoronto

Medical image segmentation has greatly aided medical diagnosis, with U-Net based architectures and nnU-Net providing state-of-the-art performance. There have been numerous general promptable models and medical variations introduced in recent years, but there is currently a lack of evaluation and comparison of these models across a variety of prompt qualities on a common medical dataset. This research uses Segment Anything Model (SAM), Segment Anything Model 2 (SAM 2), MedSAM, SAM-Med-3D, and nnU-Net to obtain zero-shot inference on the BraTS 2023 adult glioma and pediatrics dataset across multiple prompt qualities for both points and bounding boxes. Several of these models exhibit promising Dice scores, particularly SAM and SAM 2 achieving scores of up to 0.894 and 0.893, respectively when given extremely accurate bounding box prompts which exceeds nnU-Net's segmentation performance. However, nnU-Net remains the dominant medical image segmentation network due to the impracticality of providing highly accurate prompts to the models. The model and prompt evaluation, as well as the comparison, are extended through fine-tuning SAM, SAM 2, MedSAM, and SAM-Med-3D on the pediatrics dataset. The improvements in point prompt performance after fine-tuning are substantial and show promise for future investigation, but are unable to achieve better segmentation than bounding boxes or nnU-Net.

IVNov 16, 2020
A Transfer Learning Based Active Learning Framework for Brain Tumor Classification

Ruqian Hao, Khashayar Namdar, Lin Liu et al.

Brain tumor is one of the leading causes of cancer-related death globally among children and adults. Precise classification of brain tumor grade (low-grade and high-grade glioma) at early stage plays a key role in successful prognosis and treatment planning. With recent advances in deep learning, Artificial Intelligence-enabled brain tumor grading systems can assist radiologists in the interpretation of medical images within seconds. The performance of deep learning techniques is, however, highly depended on the size of the annotated dataset. It is extremely challenging to label a large quantity of medical images given the complexity and volume of medical data. In this work, we propose a novel transfer learning based active learning framework to reduce the annotation cost while maintaining stability and robustness of the model performance for brain tumor classification. We employed a 2D slice-based approach to train and finetune our model on the Magnetic Resonance Imaging (MRI) training dataset of 203 patients and a validation dataset of 66 patients which was used as the baseline. With our proposed method, the model achieved Area Under Receiver Operating Characteristic (ROC) Curve (AUC) of 82.89% on a separate test dataset of 66 patients, which was 2.92% higher than the baseline AUC while saving at least 40% of labeling cost. In order to further examine the robustness of our method, we created a balanced dataset, which underwent the same procedure. The model achieved AUC of 82% compared with AUC of 78.48% for the baseline, which reassures the robustness and stability of our proposed transfer learning augmented with active learning framework while significantly reducing the size of training data.

LGJul 2, 2020
A Brief Review of Deep Multi-task Learning and Auxiliary Task Learning

Partoo Vafaeikia, Khashayar Namdar, Farzad Khalvati

Multi-task learning (MTL) optimizes several learning tasks simultaneously and leverages their shared information to improve generalization and the prediction of the model for each task. Auxiliary tasks can be added to the main task to ultimately boost the performance. In this paper, we provide a brief review on the recent deep multi-task learning (dMTL) approaches followed by methods on selecting useful auxiliary tasks that can be used in dMTL to improve the performance of the model for the main task.

LGJun 8, 2020
A Modified AUC for Training Convolutional Neural Networks: Taking Confidence into Account

Khashayar Namdar, Masoom A. Haider, Farzad Khalvati

Receiver operating characteristic (ROC) curve is an informative tool in binary classification and Area Under ROC Curve (AUC) is a popular metric for reporting performance of binary classifiers. In this paper, first we present a comprehensive review of ROC curve and AUC metric. Next, we propose a modified version of AUC that takes confidence of the model into account and at the same time, incorporates AUC into Binary Cross Entropy (BCE) loss used for training a Convolutional neural Network for classification tasks. We demonstrate this on three datasets: MNIST, prostate MRI, and brain MRI. Furthermore, we have published GenuineAI, a new python library, which provides the functions for conventional AUC and the proposed modified AUC along with metrics including sensitivity, specificity, recall, precision, and F1 for each point of the ROC curve.

QMJun 1, 2020
A Comprehensive Study of Data Augmentation Strategies for Prostate Cancer Detection in Diffusion-weighted MRI using Convolutional Neural Networks

Ruqian Hao, Khashayar Namdar, Lin Liu et al.

Data augmentation refers to a group of techniques whose goal is to battle limited amount of available data to improve model generalization and push sample distribution toward the true distribution. While different augmentation strategies and their combinations have been investigated for various computer vision tasks in the context of deep learning, a specific work in the domain of medical imaging is rare and to the best of our knowledge, there has been no dedicated work on exploring the effects of various augmentation methods on the performance of deep learning models in prostate cancer detection. In this work, we have statically applied five most frequently used augmentation techniques (random rotation, horizontal flip, vertical flip, random crop, and translation) to prostate Diffusion-weighted Magnetic Resonance Imaging training dataset of 217 patients separately and evaluated the effect of each method on the accuracy of prostate cancer detection. The augmentation algorithms were applied independently to each data channel and a shallow as well as a deep Convolutional Neural Network (CNN) were trained on the five augmented sets separately. We used Area Under Receiver Operating Characteristic (ROC) curve (AUC) to evaluate the performance of the trained CNNs on a separate test set of 95 patients, using a validation set of 102 patients for finetuning. The shallow network outperformed the deep network with the best 2D slice-based AUC of 0.85 obtained by the rotation method.

IVNov 4, 2019
Evolution-based Fine-tuning of CNNs for Prostate Cancer Detection

Khashayar Namdar, Isha Gujrathi, Masoom A. Haider et al.

Convolutional Neural Networks (CNNs) have been used for automated detection of prostate cancer where Area Under Receiver Operating Characteristic (ROC) curve (AUC) is usually used as the performance metric. Given that AUC is not differentiable, common practice is to train the CNN using a loss functions based on other performance metrics such as cross entropy and monitoring AUC to select the best model. In this work, we propose to fine-tune a trained CNN for prostate cancer detection using a Genetic Algorithm to achieve a higher AUC. Our dataset contained 6-channel Diffusion-Weighted MRI slices of prostate. On a cohort of 2,955 training, 1,417 validation, and 1,334 test slices, we reached test AUC of 0.773; a 9.3% improvement compared to the base CNN model.