IV Oct 11, 2022
Performance Deterioration of Deep Learning Models after Clinical Deployment: A Case Study with Auto-segmentation for Definitive Prostate Cancer Radiotherapy
Biling Wang, Michael Dohopolski, Ti Bai et al.
We evaluated the temporal performance of a deep learning (DL)-based artificial intelligence (AI) model for auto-segmentation in prostate radiotherapy, seeking to correlate its efficacy with changes in the clinical landscape. Our study involved 1328 prostate cancer patients who underwent definitive radiotherapy from January 2006 to August 2022 at the University of Texas Southwestern Medical Center. We trained a UNet-based segmentation model on data from 2006 to 2011 and tested it on data from 2012 to 2022 to simulate real-world clinical deployment. We measured model performance using the Dice similarity coefficient (DSC) and visualized the trends in contour quality using exponentially weighted moving average (EMA) curves. Additionally, we performed the Wilcoxon rank-sum test to analyze the differences in DSC distributions across distinct periods, and multiple linear regression to investigate the impact of various clinical factors. The model exhibited peak performance in the initial phase (from 2012 to 2014) for segmenting the prostate, rectum, and bladder. However, we observed a notable decline in performance for the prostate and rectum after 2015, while bladder contour quality remained stable. Key factors that impacted the prostate contour quality included physician contouring styles, the use of various hydrogel spacers, CT scan slice thickness, MRI-guided contouring, and the use of intravenous (IV) contrast. Rectum contour quality was influenced by factors such as slice thickness, physician contouring styles, and the use of various hydrogel spacers. The bladder contour quality was primarily affected by the use of IV contrast. This study highlights the challenges in maintaining AI model performance consistency in a dynamic clinical setting. It underscores the need for continuous monitoring and updating of AI models to ensure their ongoing effectiveness and relevance in patient care.
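As a rough illustration of the two monitoring quantities named in this abstract, the sketch below computes a DSC for two flattened binary masks and smooths a sequence of per-patient DSC values with an EMA. The function names, the smoothing factor `alpha`, and the choice to seed the EMA with the first value are assumptions for illustration; the abstract does not state the authors' settings.

```python
from typing import List, Sequence


def dice(pred: Sequence[int], ref: Sequence[int]) -> float:
    """Dice similarity coefficient, 2|A∩B| / (|A| + |B|), for flattened binary masks."""
    if len(pred) != len(ref):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for p, r in zip(pred, ref) if p and r)
    total = sum(1 for p in pred if p) + sum(1 for r in ref if r)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * overlap / total


def ema(values: Sequence[float], alpha: float = 0.1) -> List[float]:
    """Exponentially weighted moving average, seeded with the first value."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed: List[float] = []
    for v in values:
        prev = smoothed[-1] if smoothed else v
        smoothed.append(alpha * v + (1.0 - alpha) * prev)
    return smoothed


if __name__ == "__main__":
    # Hypothetical per-patient DSC values ordered by treatment date.
    dsc_series = [0.90, 0.88, 0.91, 0.80, 0.78, 0.76]
    print(ema(dsc_series, alpha=0.3))
```

A smaller `alpha` gives a smoother curve that reacts more slowly to a drop in contour quality, so the value would need to be chosen with the intended monitoring horizon in mind.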
CV Nov 19, 2022
Prior Guided Deep Difference Meta-Learner for Fast Adaptation to Stylized Segmentation
Anjali Balagopal, Dan Nguyen, Ti Bai et al.
When a pre-trained general auto-segmentation model is deployed at a new institution, a support framework in the proposed Prior-guided DDL network will learn the systematic difference between the model predictions and the final contours revised and approved by clinicians for an initial group of patients. The learned style feature differences are concatenated with the new patient's (query) features and then decoded to get the style-adapted segmentations. The model is independent of practice styles and anatomical structures. It meta-learns with simulated style differences and does not need to be exposed to any real clinical stylized structures during training. Once trained on the simulated data, it can be deployed for clinical use to adapt to new practice styles and new anatomical structures without further training. To show the proof of concept, we tested the Prior-guided DDL network on six different practice style variations for three different anatomical structures. Pre-trained segmentation models were adapted from post-operative clinical target volume (CTV) segmentation to segment CTVstyle1, CTVstyle2, and CTVstyle3, from parotid gland segmentation to segment Parotidsuperficial, and from rectum segmentation to segment Rectumsuperior and Rectumposterior. The model performance was quantified with the Dice Similarity Coefficient (DSC). With adaptation based on only the first three patients, the average DSCs were improved from 78.6, 71.9, 63.0, 52.2, 46.3 and 69.6 to 84.4, 77.8, 73.0, 77.8, 70.5, 68.1, for CTVstyle1, CTVstyle2, and CTVstyle3, Parotidsuperficial, Rectumsuperior, and Rectumposterior, respectively, showing the great potential of the Prior-guided DDL network for a fast and effortless adaptation to new practice styles.
IV Feb 3, 2023
Deep Learning (DL)-based Automatic Segmentation of the Internal Pudendal Artery (IPA) for Reduction of Erectile Dysfunction in Definitive Radiotherapy of Localized Prostate Cancer
Anjali Balagopal, Michael Dohopolski, Young Suk Kwon et al.
Background and purpose: Radiation-induced erectile dysfunction (RiED) is commonly seen in prostate cancer patients. Clinical trials have been developed in multiple institutions to investigate whether dose-sparing to the internal pudendal arteries (IPA) will improve retention of sexual potency. The IPA is usually not considered a conventional organ-at-risk (OAR) due to segmentation difficulty. In this work, we propose a deep learning (DL)-based auto-segmentation model for the IPA that utilizes CT and MRI or CT alone as the input image modality to accommodate variation in clinical practice. Materials and methods: 86 patients with CT and MRI images and noisy IPA labels were recruited in this study. We split the data into 42/14/30 for model training, testing, and a clinical observer study, respectively. There were three major innovations in this model: 1) we designed an architecture with squeeze-and-excite blocks and modality attention for effective feature extraction and production of accurate segmentation, 2) a novel loss function was used for training the model effectively with noisy labels, and 3) a modality dropout strategy was used to make the model capable of segmentation in the absence of MRI. Results: The DSC, ASD, and HD95 values for the test dataset were 62.2%, 2.54 mm, and 7 mm, respectively. AI-segmented contours were dosimetrically equivalent to the expert physician's contours. The observer study showed that expert physicians scored AI contours (mean=3.7) higher than inexperienced physicians' contours (mean=3.1). When inexperienced physicians started with AI contours, the score improved to 3.7. Conclusion: The proposed model achieved good quality IPA contours to improve uniformity of segmentation and to facilitate introduction of standardized IPA segmentation into clinical trials and practice.
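The modality dropout idea mentioned in this abstract can be sketched in a few lines: during training, the MRI input is occasionally blanked so the network also learns to segment from CT alone. The abstract does not say how the dropout is implemented, so the zero-filling, the drop probability `p_drop`, and the function name below are all assumptions.

```python
import random
from typing import List, Sequence, Tuple


def modality_dropout(
    ct: Sequence[float],
    mri: Sequence[float],
    p_drop: float,
    rng: random.Random,
) -> Tuple[List[float], List[float]]:
    """Return (ct, mri) training inputs, blanking the MRI channel with probability p_drop."""
    if not 0.0 <= p_drop <= 1.0:
        raise ValueError("p_drop must be in [0, 1]")
    if rng.random() < p_drop:
        # Assumed convention: a missing modality is represented by zeros.
        return list(ct), [0.0] * len(mri)
    return list(ct), list(mri)


if __name__ == "__main__":
    rng = random.Random(0)
    ct = [0.2, 0.5, 0.9]   # hypothetical flattened CT patch
    mri = [0.7, 0.1, 0.4]  # hypothetical co-registered MRI patch
    for _ in range(3):
        print(modality_dropout(ct, mri, p_drop=0.5, rng=rng))
```

At inference, a CT-only case would then be fed with the same zero-filled MRI channel the network saw during training.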
MED-PH Apr 7
Spatiotemporal Gaussian representation-based dynamic reconstruction and motion estimation framework for time-resolved volumetric MR imaging (DREME-GSMR)
Jiacheng Xie, Hua-Chieh Shao, Can Wu et al.
Time-resolved volumetric MR imaging that reconstructs a 3D MRI within sub-seconds to resolve deformable motion is essential for motion-adaptive radiotherapy. Representing patient anatomy and associated motion fields as 3D Gaussians, we developed a spatiotemporal Gaussian representation-based framework (DREME-GSMR), which enables time-resolved dynamic MRI reconstruction from a pre-treatment 3D MR scan without any prior anatomical/motion model. DREME-GSMR represents a reference MRI volume and a corresponding low-rank motion model (as motion-basis components) using 3D Gaussians, and incorporates a dual-path MLP/CNN motion encoder to estimate temporal motion coefficients of the motion model from raw k-space-derived signals. Furthermore, using the solved motion model, DREME-GSMR can infer motion coefficients directly from new online k-space data, allowing subsequent intra-treatment volumetric MR imaging and motion tracking (real-time imaging). A motion-augmentation strategy is further introduced to improve robustness to unseen motion patterns during real-time imaging. DREME-GSMR was evaluated on the XCAT digital phantom, a physical motion phantom, and MR-LINAC datasets acquired from 6 healthy volunteers and 20 patients (with independent sequential scans for cross-evaluation). DREME-GSMR reconstructs MRIs of a ~400ms temporal resolution, with an inference time of ~10ms/volume. In XCAT experiments, DREME-GSMR achieved mean(s.d.) SSIM, tumor center-of-mass-error(COME), and DSC of 0.92(0.01)/0.91(0.02), 0.50(0.15)/0.65(0.19) mm, and 0.92(0.02)/0.92(0.03) for dynamic reconstruction/real-time imaging. For the physical phantom, the mean target COME was 1.19(0.94)/1.40(1.15) mm for dynamic/real-time imaging, while for volunteers and patients, the mean liver COME for real-time imaging was 1.31(0.82) and 0.96(0.64) mm, respectively.
CV May 1, 2025
AI-Assisted Decision-Making for Clinical Assessment of Auto-Segmented Contour Quality
Biling Wang, Austen Maniscalco, Ti Bai et al.
Purpose: This study presents a Deep Learning (DL)-based quality assessment (QA) approach for evaluating auto-generated contours (auto-contours) in radiotherapy, with emphasis on Online Adaptive Radiotherapy (OART). Leveraging Bayesian Ordinal Classification (BOC) and calibrated uncertainty thresholds, the method enables confident QA predictions without relying on ground truth contours or extensive manual labeling. Methods: We developed a BOC model to classify auto-contour quality and quantify prediction uncertainty. A calibration step was used to optimize uncertainty thresholds that meet clinical accuracy needs. The method was validated under three data scenarios: no manual labels, limited labels, and extensive labels. For rectum contours in prostate cancer, we applied geometric surrogate labels when manual labels were absent, transfer learning when limited, and direct supervision when ample labels were available. Results: The BOC model delivered robust performance across all scenarios. Fine-tuning with just 30 manual labels and calibrating with 34 subjects yielded over 90% accuracy on test data. Using the calibrated threshold, over 93% of the auto-contours' qualities were accurately predicted in over 98% of cases, reducing unnecessary manual reviews and highlighting cases needing correction. Conclusion: The proposed QA model enhances contouring efficiency in OART by reducing manual workload and enabling fast, informed clinical decisions. Through uncertainty quantification, it ensures safer, more reliable radiotherapy workflows.
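One way to picture the threshold calibration step is as a search over a held-out calibration set: accept a prediction only when its uncertainty is at or below a threshold, and pick the largest threshold at which the accepted predictions still meet a target accuracy. The sketch below follows that reading; it is not the authors' procedure, and the function name, inputs, and return value are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def calibrate_threshold(
    uncertainties: Sequence[float],
    correct: Sequence[bool],
    target_accuracy: float,
) -> Optional[Tuple[float, float]]:
    """Largest uncertainty threshold whose accepted cases reach target_accuracy.

    Returns (threshold, coverage), where coverage is the fraction of cases
    with uncertainty <= threshold, or None if no threshold qualifies.
    """
    if len(uncertainties) != len(correct):
        raise ValueError("inputs must have the same length")
    pairs = sorted(zip(uncertainties, correct))
    best: Optional[Tuple[float, float]] = None
    n_correct = 0
    for i, (u, ok) in enumerate(pairs, start=1):
        n_correct += 1 if ok else 0
        # Evaluate only after the last case sharing this uncertainty value,
        # because a threshold of u accepts all of them together.
        if i < len(pairs) and pairs[i][0] == u:
            continue
        if n_correct / i >= target_accuracy:
            best = (u, i / len(pairs))
    return best


if __name__ == "__main__":
    # Hypothetical calibration data: model uncertainty and whether the QA label was right.
    unc = [0.1, 0.2, 0.3, 0.4]
    ok = [True, True, False, True]
    print(calibrate_threshold(unc, ok, target_accuracy=0.9))
```

Cases above the returned threshold would be routed to manual review, which is the trade-off between coverage and accuracy that the abstract describes.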
CV Jul 28, 2021
A Proof-of-Concept Study of Artificial Intelligence Assisted Contour Revision
Ti Bai, Anjali Balagopal, Michael Dohopolski et al.
Automatic segmentation of anatomical structures is critical for many medical applications. However, the results are not always clinically acceptable and require tedious manual revision. Here, we present a novel concept called artificial intelligence assisted contour revision (AIACR) and demonstrate its feasibility. The proposed clinical workflow of AIACR is as follows: given an initial contour that requires a clinician's revision, the clinician indicates where a large revision is needed, and a trained deep learning (DL) model takes this input to update the contour. This process repeats until a clinically acceptable contour is achieved. The DL model is designed to minimize the clinician's input at each iteration and to minimize the number of iterations needed to reach acceptance. In this proof-of-concept study, we demonstrated the concept on 2D axial images of three head-and-neck cancer datasets, with the clinician's input at each iteration being one mouse click on the desired location of the contour segment. The performance of the model is quantified with the Dice Similarity Coefficient (DSC) and the 95th percentile of the Hausdorff Distance (HD95). The average DSC/HD95 (mm) of the auto-generated initial contours were 0.82/4.3, 0.73/5.6 and 0.67/11.4 for the three datasets, which were improved to 0.91/2.1, 0.86/2.4 and 0.86/4.7 with three mouse clicks, respectively. Each DL-based contour update requires around 20 ms. We proposed a novel AIACR concept that uses DL models to assist clinicians in revising contours in an efficient and effective way, and we demonstrated its feasibility by using 2D axial CT images from three head-and-neck cancer datasets.
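For readers unfamiliar with HD95, the sketch below computes it for two sets of 2D contour points. Several conventions exist; this version takes the larger of the two directed 95th-percentile distances and uses a nearest-rank percentile, both of which are assumptions rather than the definition used in the paper.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def _directed_distances(a: Sequence[Point], b: Sequence[Point]) -> List[float]:
    """Distance from each point in a to its nearest point in b."""
    return [min(math.dist(p, q) for q in b) for p in a]


def _percentile_nearest_rank(values: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


def hd95(a: Sequence[Point], b: Sequence[Point]) -> float:
    """95th-percentile Hausdorff distance between two non-empty point sets."""
    if not a or not b:
        raise ValueError("both point sets must be non-empty")
    return max(
        _percentile_nearest_rank(_directed_distances(a, b), 95),
        _percentile_nearest_rank(_directed_distances(b, a), 95),
    )


if __name__ == "__main__":
    contour = [(float(i), 0.0) for i in range(20)]
    with_outlier = contour + [(0.0, 100.0)]  # one stray point far from the contour
    print(hd95(contour, with_outlier))
```

Unlike the maximum Hausdorff distance, the 95th percentile is not dominated by a single stray point, which is why it is commonly reported alongside DSC.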
LG Jun 15, 2021
Site-Agnostic 3D Dose Distribution Prediction with Deep Learning Neural Networks
Maryam Mashayekhi, Itzel Ramirez Tapia, Anjali Balagopal et al.
Typically, the current dose prediction models are limited to small amounts of data and require re-training for a specific site, often leading to suboptimal performance. We propose a site-agnostic, 3D dose distribution prediction model using deep learning that can leverage data from any treatment site, thus increasing the total data available to train the model. Applying our proposed model to a new target treatment site requires only a brief fine-tuning of the model to the new data and involves no modifications to the model input channels or its parameters. Thus, it can be efficiently adapted to a different treatment site, even with a small training dataset.
MED-PH Feb 1, 2021
Dosimetric impact of physician style variations in contouring CTV for post-operative prostate cancer: A deep learning-based simulation study
Anjali Balagopal, Dan Nguyen, Maryam Mashayekhi et al.
Inter-observer variation is a significant problem in clinical target volume (CTV) segmentation in postoperative settings, where there is no gross tumor present. In this scenario, the CTV is not an anatomically established structure, but one determined by the physician based on the clinical guideline used, the preferred tradeoff between tumor control and toxicity, their experience and training background, and other factors. This results in high inter-observer variability between physicians. This variability has been considered an issue, but the absence of multiple physician CTV contours for each patient and the significant amount of time required for dose planning have made it impractical to study its dosimetric consequences. In this study, we analyze the impact that variations in physician style have on dose to organs-at-risk (OAR) by simulating the clinical workflow via deep learning. For a given patient previously treated by one physician, we use deep learning-based tools to simulate how other physicians would contour the CTV and how the corresponding dose distributions would look for this patient. To simulate multiple physician styles, we use a previously developed in-house CTV segmentation model that can produce physician style-aware segmentations. The corresponding dose distribution is predicted using another in-house deep learning tool, which can predict dose within 3% of the prescription dose, on average, on the test data. For every test patient, four different physician style CTVs are considered, and four different dose distributions are analyzed. OAR dose metrics are compared, showing that even though physician style variations result in organs receiving different doses, all the important dose metrics except the maximum dose point are within the clinically acceptable limit.
MED-PH Nov 1, 2020
A comparison of Monte Carlo dropout and bootstrap aggregation on the performance and uncertainty estimation in radiation therapy dose prediction with deep learning neural networks
Dan Nguyen, Azar Sadeghnejad Barkousaraie, Gyanendra Bohara et al.
Recently, artificial intelligence technologies and algorithms have become a major focus for advancements in treatment planning for radiation therapy. As these are starting to become incorporated into the clinical workflow, a major concern from clinicians is not whether the model is accurate, but whether the model can express to a human operator when it does not know if its answer is correct. We propose to use Monte Carlo dropout (MCDO) and the bootstrap aggregation (bagging) technique on deep learning models to produce uncertainty estimations for radiation therapy dose prediction. We show that both models are capable of generating a reasonable uncertainty map, and, with our proposed scaling technique, creating interpretable uncertainties and bounds on the prediction and any relevant metrics. Performance-wise, bagging provides a statistically significant reduction in loss value and errors in most of the metrics investigated in this study. The addition of bagging was able to further reduce errors by another 0.34% for Dmean and 0.19% for Dmax, on average, when compared to the baseline framework. Overall, the bagging framework provided a significantly lower MAE of 2.62, as opposed to the baseline framework's MAE of 2.87. The usefulness of bagging, from solely a performance standpoint, does highly depend on the problem and the acceptable predictive error, and its high upfront computational cost during training should be factored into deciding whether it is advantageous to use it. In terms of deployment with uncertainty estimations turned on, both frameworks offer the same performance time of about 12 seconds. As an ensemble-based metaheuristic, bagging can be used with existing machine learning architectures to improve stability and performance, and MCDO can be applied to any deep learning models that have dropout as part of their architecture.
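The bagging idea in this abstract can be shown on a toy problem: train several models on bootstrap resamples of the training data, then use the mean of their predictions as the estimate and the spread as an uncertainty signal. The sketch below substitutes a one-variable least-squares line for the deep dose-prediction network, so every name and parameter here is an illustrative assumption and not the paper's setup.

```python
import random
import statistics
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit y = slope * x + intercept."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0.0:
        # Degenerate resample (all x identical): fall back to a constant predictor.
        return 0.0, my
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def bagged_predict(
    xs: Sequence[float],
    ys: Sequence[float],
    x_new: float,
    n_models: int = 50,
    seed: int = 0,
) -> Tuple[float, float]:
    """Return (mean prediction, standard deviation across bootstrap models)."""
    rng = random.Random(seed)
    n = len(xs)
    preds = []
    for _ in range(n_models):
        # Bootstrap: sample n training cases with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        slope, intercept = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        preds.append(slope * x_new + intercept)
    return statistics.fmean(preds), statistics.pstdev(preds)


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
    ys = [1.1, 2.9, 5.2, 6.8, 9.1, 11.0]  # hypothetical noisy observations
    mean, spread = bagged_predict(xs, ys, x_new=6.0)
    print(f"prediction {mean:.2f} +/- {spread:.2f}")
```

The cost noted in the abstract is visible even here: `n_models` separate fits are needed up front, whereas Monte Carlo dropout reuses a single trained network with dropout left active at inference.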
MED-PH Jun 30, 2020
Dose Prediction with Deep Learning for Prostate Cancer Radiation Therapy: Model Adaptation to Different Treatment Planning Practices
Roya Norouzi Kandalan, Dan Nguyen, Nima Hassan Rezaeian et al.
This work aims to study the generalizability of a pre-developed deep learning (DL) dose prediction model for volumetric modulated arc therapy (VMAT) for prostate cancer and to adapt the model to three different internal treatment planning styles and one external institution planning style. We built the source model with planning data from 108 patients previously treated with VMAT for prostate cancer. For the transfer learning, we selected patient cases planned with three different styles from the same institution and one style from a different institution to adapt the source model to four target models. We compared the dose distributions predicted by the source model and the target models with the clinical dose predictions and quantified the improvement in the prediction quality for the target models over the source model using the Dice similarity coefficients (DSC) of 10% to 100% isodose volumes and the dose-volume-histogram (DVH) parameters of the planning target volume and the organs-at-risk. The source model accurately predicts dose distributions for plans generated in the same source style but performs sub-optimally for the three internal and one external target styles, with the mean DSC ranging between 0.81-0.94 and 0.82-0.91 for the internal and the external styles, respectively. With transfer learning, the target model predictions improved the mean DSC to 0.88-0.95 and 0.92-0.96 for the internal and the external styles, respectively. Target model predictions significantly improved the accuracy of the DVH parameter predictions to within 1.6%. We demonstrated model generalizability for DL-based dose prediction and the feasibility of using transfer learning to solve this problem. With 14-29 cases per style, we successfully adapted the source model into several different practice styles. This indicates a realistic way to widespread clinical implementation of DL-based dose prediction.
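The isodose-volume DSC used as a metric in this abstract can be sketched as follows: threshold both dose distributions at a given percentage of a reference dose, then compute the Dice overlap of the two resulting volumes. Taking the prescription dose as the reference and using `>=` for the threshold are assumptions; the abstract only states that 10% to 100% isodose volumes were compared.

```python
from typing import Sequence


def isodose_dice(
    pred_dose: Sequence[float],
    ref_dose: Sequence[float],
    prescription: float,
    level_percent: float,
) -> float:
    """Dice overlap of the volumes receiving at least level_percent of the prescription."""
    if len(pred_dose) != len(ref_dose):
        raise ValueError("dose arrays must have the same number of voxels")
    threshold = prescription * level_percent / 100.0
    pred_mask = [d >= threshold for d in pred_dose]
    ref_mask = [d >= threshold for d in ref_dose]
    overlap = sum(1 for p, r in zip(pred_mask, ref_mask) if p and r)
    total = sum(pred_mask) + sum(ref_mask)
    # Assumed convention: two empty isodose volumes count as a perfect match.
    return 1.0 if total == 0 else 2.0 * overlap / total


if __name__ == "__main__":
    # Hypothetical flattened dose values in Gy for a predicted and a clinical plan.
    predicted = [10.0, 50.0, 90.0, 100.0]
    clinical = [10.0, 60.0, 80.0, 100.0]
    for level in range(10, 101, 10):
        print(level, round(isodose_dice(predicted, clinical, 100.0, level), 3))
```

Sweeping the level from 10% to 100% gives one DSC per isodose volume, so disagreement confined to the high-dose region shows up only at the upper levels.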
IV Apr 28, 2020
A deep learning-based framework for segmenting invisible clinical target volumes with estimated uncertainties for post-operative prostate cancer radiotherapy
Anjali Balagopal, Dan Nguyen, Howard Morgan et al.
In post-operative radiotherapy for prostate cancer, the cancerous prostate gland has been surgically removed, so the clinical target volume (CTV) to be irradiated encompasses the microscopic spread of tumor cells, which cannot be visualized in typical clinical images such as computed tomography or magnetic resonance imaging. In current clinical practice, physicians segment CTVs manually based on their relationship with nearby organs and other clinical information, per clinical guidelines. Automating post-operative prostate CTV segmentation with traditional image segmentation methods has been a major challenge. Here, we propose a deep learning model to overcome this problem by segmenting nearby organs first, then using their relationship with the CTV to assist CTV segmentation. The model proposed is trained using labels clinically approved and used for patient treatment, which are subject to relatively large inter-physician variations due to the absence of a visual ground truth. The model achieves an average Dice similarity coefficient (DSC) of 0.87 on a holdout dataset of 50 patients, much better than established methods, such as atlas-based methods (DSC<0.7). The uncertainties associated with automatically segmented CTV contours are also estimated to help physicians inspect and revise the contours, especially in areas with large inter-physician variations. We also use a 4-point grading system to show that the clinical quality of the automatically segmented CTV contours is equal to that of approved clinical contours manually drawn by physicians.
MED-PH Dec 17, 2018
Three-Dimensional Dose Prediction for Lung IMRT Patients with Deep Neural Networks: Robust Learning from Heterogeneous Beam Configurations
Ana M. Barragan-Montero, Dan Nguyen, Weiguo Lu et al.
The use of neural networks to directly predict three-dimensional dose distributions for automatic planning is becoming popular. However, the existing methods only use patient anatomy as input and assume consistent beam configuration for all patients in the training database. The purpose of this work is to develop a more general model that, in addition to patient anatomy, also considers variable beam configurations, to achieve a more comprehensive automatic planning with a potentially easier clinical implementation, without the need of training specific models for different beam settings.
MED-PH May 31, 2018
Fully Automated Organ Segmentation in Male Pelvic CT Images
Anjali Balagopal, Samaneh Kazemifar, Dan Nguyen et al.
Accurate segmentation of prostate and surrounding organs at risk is important for prostate cancer radiotherapy treatment planning. We present a fully automated workflow for male pelvic CT image segmentation using deep learning. The architecture consists of a 2D localization network followed by a 3D segmentation network for volumetric segmentation of prostate, bladder, rectum, and femoral heads. We used a multi-channel 2D U-Net followed by a 3D U-Net with encoding arm modified with aggregated residual networks, known as ResNeXt. The models were trained and tested on a pelvic CT image dataset comprising 136 patients. Test results show that 3D U-Net based segmentation achieves mean (SD) Dice coefficient values of 90 (2.0)%, 96 (3.0)%, 95 (1.3)%, 95 (1.5)%, and 84 (3.7)% for prostate, left femoral head, right femoral head, bladder, and rectum, respectively, using the proposed fully automated segmentation method.
MED-PH May 25, 2018
Three-Dimensional Radiotherapy Dose Prediction on Head and Neck Cancer Patients with a Hierarchically Densely Connected U-net Deep Learning Architecture
Dan Nguyen, Xun Jia, David Sher et al.
The treatment planning process for patients with head and neck (H&N) cancer is regarded as one of the most complicated due to large target volume, multiple prescription dose levels, and many radiation-sensitive critical structures near the target. Treatment planning for this site requires a high level of human expertise and a tremendous amount of effort to produce personalized, high-quality plans, taking as long as a week, which reduces the chances of tumor control and patient survival. To solve this problem, we propose to investigate a deep learning-based dose prediction model, Hierarchically Densely Connected U-net, based on two highly popular network architectures: U-net and DenseNet. We find that this new architecture is able to accurately and efficiently predict the dose distribution, outperforming the other two models, the Standard U-net and DenseNet, in homogeneity, dose conformity, and dose coverage on the test data. Averaging across all organs at risk, our proposed model is capable of predicting the organ-at-risk max dose within 6.3% and mean dose within 5.1% of the prescription dose on the test data. The other models, the Standard U-net and DenseNet, performed worse, having an averaged organ-at-risk max dose prediction error of 8.2% and 9.3%, respectively, and averaged mean dose prediction error of 6.4% and 6.8%, respectively. In addition, our proposed model used 12 times fewer trainable parameters than the Standard U-net, and predicted the patient dose 4 times faster than DenseNet.