Joon Lee

CV
7 papers
3,094 citations
Novelty 29%
AI Score 24

7 Papers

IV Jun 13, 2022
Prostate Cancer Malignancy Detection and localization from mpMRI using auto-Deep Learning: One Step Closer to Clinical Utilization

Weiwei Zong, Eric Carver, Simeng Zhu et al.

Automatic diagnosis of malignant prostate cancer from mpMRI has been studied heavily in recent years. Model interpretation and domain drift have been the main roadblocks to clinical utilization. This work extends our previous work, in which we trained a customized convolutional neural network on a public cohort of 201 patients using cropped 2D patches around the region of interest as the input. Here, cropped 2.5D slices of the prostate gland were used as the input, and the optimal model was searched for in the model space using autoKeras. In addition, the peripheral zone (PZ) and central gland (CG) were trained and tested separately. The PZ detector and CG detector were shown to be effective at highlighting the most suspicious slices in a sequence, which we hope will greatly ease the workload for physicians.
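The abstract does not define how its 2.5D input is built. A minimal sketch, assuming the common convention of stacking each slice with its immediate neighbors as channels, could look like the following; the function name `make_2p5d_stacks`, the `context` parameter, and the edge-clamping rule are illustrative assumptions, not the authors' pipeline.

```python
def make_2p5d_stacks(volume, context=1):
    """Build one 2.5D stack per slice of a volume.

    volume  : sequence of 2D slices (any objects, e.g. nested lists or arrays)
    context : number of neighboring slices taken on each side (assumed value)

    Edge slices are clamped, i.e. the first and last slices are repeated
    where a neighbor is missing. This is an assumption for illustration.
    """
    n = len(volume)
    stacks = []
    for i in range(n):
        # Indices of the neighbors, clamped to the valid range [0, n - 1].
        idx = [min(max(i + d, 0), n - 1) for d in range(-context, context + 1)]
        stacks.append([volume[j] for j in idx])
    return stacks


# Toy usage: slices are represented by labels instead of real images.
toy_volume = ["s0", "s1", "s2"]
toy_stacks = make_2p5d_stacks(toy_volume, context=1)
```

Each stack has `2 * context + 1` channels, so a 2D network can see a small amount of through-plane context without the cost of a full 3D model.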

LG Mar 25, 2023
Unsupervised Feature Selection to Identify Important ICD-10 Codes for Machine Learning: A Case Study on a Coronary Artery Disease Patient Cohort

Peyman Ghasemi, Joon Lee

The use of International Classification of Diseases (ICD) codes in healthcare presents a challenge in selecting relevant codes as features for machine learning models due to this system's large number of codes. In this study, we compared several unsupervised feature selection methods for an ICD code database of 49,075 coronary artery disease patients in Alberta, Canada. Specifically, we employed Laplacian Score, Unsupervised Feature Selection for Multi-Cluster Data, Autoencoder Inspired Unsupervised Feature Selection, Principal Feature Analysis, and Concrete Autoencoders with and without ICD tree weight adjustment to select the 100 best features from over 9,000 codes. We assessed the selected features based on their ability to reconstruct the initial feature space and predict 90-day mortality following discharge. Our findings revealed that the Concrete Autoencoder methods outperformed all other methods in both tasks. Furthermore, the weight adjustment in the Concrete Autoencoder method decreased the complexity of features.
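Of the methods listed, the Laplacian Score is the simplest to illustrate. The sketch below is a standard-library toy version, not the study's implementation: it assumes a fully connected heat-kernel similarity graph (the original method typically uses a k-nearest-neighbor graph), and the `sigma` bandwidth and the toy data are hypothetical. Lower scores indicate features that better preserve the local structure of the data.

```python
import math


def laplacian_scores(X, sigma=1.0):
    """Return one Laplacian Score per feature (lower = more relevant).

    X     : list of samples, each a list of numeric feature values
    sigma : heat-kernel bandwidth (assumed value for this sketch)
    """
    n, m = len(X), len(X[0])
    # Similarity matrix S from a heat kernel on squared Euclidean distance.
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((X[i][k] - X[j][k]) ** 2 for k in range(m))
                S[i][j] = math.exp(-d2 / (2.0 * sigma ** 2))
    deg = [sum(row) for row in S]  # diagonal of the degree matrix D
    total = sum(deg)
    scores = []
    for r in range(m):
        f = [X[i][r] for i in range(n)]
        # Remove the degree-weighted mean of the feature.
        mean = sum(f[i] * deg[i] for i in range(n)) / total
        ft = [v - mean for v in f]
        # f^T L f equals 0.5 * sum_ij S_ij (f_i - f_j)^2 for L = D - S.
        num = 0.5 * sum(
            S[i][j] * (ft[i] - ft[j]) ** 2 for i in range(n) for j in range(n)
        )
        den = sum(deg[i] * ft[i] ** 2 for i in range(n))
        scores.append(num / den if den > 0 else float("inf"))
    return scores


# Toy data: feature 0 separates two clusters, feature 1 is noise.
toy_X = [
    [0.0, 0.3], [0.1, -0.3], [0.0, 0.1],
    [5.0, -0.2], [5.1, 0.3], [5.0, 0.0],
]
toy_scores = laplacian_scores(toy_X)
```

Selecting the top-k features then amounts to sorting the scores in ascending order and keeping the first k indices.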

LG Feb 9, 2023
Machine Learning Capability: A standardized metric using case difficulty with applications to individualized deployment of supervised machine learning

Adrienne Kline, Joon Lee

Model evaluation is a critical component in supervised machine learning classification analyses. Traditional metrics do not currently incorporate case difficulty. This renders the classification results unbenchmarked for generalization. Item Response Theory (IRT) and Computer Adaptive Testing (CAT) with machine learning can benchmark datasets independent of the end-classification results. This provides high levels of case-level information regarding evaluation utility. To showcase this, two datasets were used: 1) health-related and 2) physical science. A two-parameter IRT model was used for the health dataset and a polytomous IRT model for the physical science dataset to analyze predictive features and place each case on a difficulty continuum. A CAT approach was used to ascertain the algorithms' performance and applicability to new data. This method provides an efficient way to benchmark data, using only a fraction of the dataset (less than 1%), and is 22-60x more computationally efficient than traditional metrics. This novel metric, termed Machine Learning Capability (MLC), has additional benefits: it is unbiased to outcome classification and provides a standardized way to make model comparisons within and across datasets. MLC provides a metric on the limitation of supervised machine learning algorithms. In situations where the algorithm falls short, other input(s) are required for decision-making.
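For readers unfamiliar with the two-parameter IRT model mentioned above, the following sketch shows its standard logistic form and the item information function that adaptive testing commonly uses to pick the next item. Mapping "ability" to an algorithm and "difficulty" to a case is an interpretation of the abstract; the function names and parameter values are illustrative, not taken from the paper.

```python
import math


def two_pl_probability(theta, a, b):
    """Two-parameter logistic (2PL) IRT model.

    theta : ability of the respondent (here, loosely, the algorithm)
    a     : discrimination of the item (here, loosely, the case)
    b     : difficulty of the item
    Returns the probability of a correct response.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta: a^2 * P * (1 - P).

    Adaptive testing typically selects the item with the highest
    information at the current ability estimate.
    """
    p = two_pl_probability(theta, a, b)
    return a ** 2 * p * (1.0 - p)


# When ability equals difficulty, the success probability is one half,
# and that is also where a 2PL item is most informative.
p_equal = two_pl_probability(theta=0.0, a=1.0, b=0.0)
info_peak = item_information(theta=0.0, a=2.0, b=0.0)
```

Because information peaks where ability matches difficulty, an adaptive procedure can estimate ability from a small number of well-chosen cases, which is consistent with the abstract's claim of needing only a fraction of the dataset.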

CV Apr 4, 2019
Segmentation of the Prostatic Gland and the Intraprostatic Lesions on Multiparametic MRI Using Mask-RCNN

Zhenzhen Dai, Eric Carver, Chang Liu et al.

Prostate cancer (PCa) is the most common cancer in men in the United States. Multiparametric magnetic resonance imaging (mp-MRI) has been explored by many researchers to target prostate biopsies and radiation therapy. However, assessment on mp-MRI can be subjective, so the development of computer-aided diagnosis systems that automatically delineate the prostate gland and the intraprostatic lesions (ILs) is important for assisting radiologists in clinical practice. In this paper, we first study the implementation of the Mask-RCNN model to segment the prostate and ILs. We trained and evaluated models on 120 patients from two different cohorts. We also used 2D U-Net and 3D U-Net as benchmarks to segment the prostate and compared the models' performance. The contour variability of ILs using the algorithm was also benchmarked against the interobserver variability between two different radiation oncologists on 19 patients. Our results indicate that the Mask-RCNN model is able to reach state-of-the-art performance in prostate segmentation and outperforms several competitive baselines in IL segmentation.

CV Mar 29, 2019
A Deep Dive into Understanding Tumor Foci Classification using Multiparametric MRI Based on Convolutional Neural Network

Weiwei Zong, Joon Lee, Chang Liu et al.

Deep learning models have had great success in disease classification using large data pools of skin cancer images or lung X-rays. However, data scarcity has been the roadblock to applying deep learning models directly to prostate multiparametric MRI (mpMRI). Although model interpretation has been heavily studied for natural images over the past few years, there has been a lack of interpretation of deep learning models trained on medical images. This work designs a customized workflow for the small and imbalanced data set of prostate mpMRI, in which features were extracted from a deep learning model and then analyzed by a traditional machine learning classifier. In addition, this work contributes to revealing how deep learning models interpret mpMRI for prostate cancer patient stratification.

CV Nov 5, 2018
Identifying the Best Machine Learning Algorithms for Brain Tumor Segmentation, Progression Assessment, and Overall Survival Prediction in the BRATS Challenge

Spyridon Bakas, Mauricio Reyes, Andras Jakab et al.

Gliomas are the most common primary brain malignancies, with different degrees of aggressiveness, variable prognosis and various heterogeneous histologic sub-regions, i.e., peritumoral edematous/invaded tissue, necrotic core, active and non-enhancing core. This intrinsic heterogeneity is also portrayed in their radio-phenotype, as their sub-regions are depicted by varying intensity profiles disseminated across multi-parametric magnetic resonance imaging (mpMRI) scans, reflecting varying biological properties. Their heterogeneous shape, extent, and location are some of the factors that make these tumors difficult to resect, and in some cases inoperable. The amount of resected tumor is a factor also considered in longitudinal scans, when evaluating the apparent tumor for potential diagnosis of progression. Furthermore, there is mounting evidence that accurate segmentation of the various tumor sub-regions can offer the basis for quantitative image analysis towards prediction of patient overall survival. This study assesses the state-of-the-art machine learning (ML) methods used for brain tumor image analysis in mpMRI scans, during the last seven instances of the International Brain Tumor Segmentation (BraTS) challenge, i.e., 2012-2018. Specifically, we focus on i) evaluating segmentations of the various glioma sub-regions in pre-operative mpMRI scans, ii) assessing potential tumor progression by virtue of longitudinal growth of tumor sub-regions, beyond use of the RECIST/RANO criteria, and iii) predicting the overall survival from pre-operative mpMRI scans of patients that underwent gross total resection. Finally, we investigate the challenge of identifying the best ML algorithms for each of these tasks, considering that apart from being diverse on each instance of the challenge, the multi-institutional mpMRI BraTS dataset has also been a continuously evolving/growing dataset.

CL Jul 26, 2017
Video Highlight Prediction Using Audience Chat Reactions

Cheng-Yang Fu, Joon Lee, Mohit Bansal et al.

Sports channel video portals offer an exciting domain for research on multimodal, multilingual analysis. We present methods addressing the problem of automatic video highlight prediction based on joint visual features and textual analysis of the real-world audience discourse with complex slang, in both English and traditional Chinese. We present a novel dataset based on League of Legends championships recorded from North American and Taiwanese Twitch.tv channels (will be released for further research), and demonstrate strong results on these using multimodal, character-level CNN-RNN model architectures.