Development and Validation of a Deep Learning-Based Microsatellite Instability Predictor from Prostate Cancer Whole-Slide Images
This work addresses the low routine testing of MSI status in prostate cancer patients, potentially directing them toward immunotherapy, though it is incremental as it applies an existing deep learning method to a new medical imaging domain.
The researchers tackled the problem of predicting microsatellite instability-high (MSI-H) status in prostate cancer from routine H&E whole-slide images, achieving area under the ROC curve values of 0.78, 0.72, and 0.72 on internal, external, and temporal validation sets.
Microsatellite instability-high (MSI-H) is a tumor agnostic biomarker for immune checkpoint inhibitor therapy. However, MSI status is not routinely tested in prostate cancer, in part due to low prevalence and assay cost. As such, prediction of MSI status from hematoxylin and eosin (H&E) stained whole-slide images (WSIs) could identify prostate cancer patients most likely to benefit from confirmatory testing and becoming eligible for immunotherapy. Prostate biopsies and surgical resections from de-identified records of consecutive prostate cancer patients referred to our institution were analyzed. Their MSI status was determined by next generation sequencing. Patients before a cutoff date were split into an algorithm development set (n=4015, MSI-H 1.8%) and a paired validation set (n=173, MSI-H 19.7%) that consisted of two serial sections from each sample, one stained and scanned internally and the other at an external site. Patients after the cutoff date formed the temporal validation set (n=1350, MSI-H 2.3%). Attention-based multiple instance learning models were trained to predict MSI-H from H&E WSIs. The MSI-H predictor achieved area under the receiver operating characteristic curve values of 0.78 (95% CI [0.69-0.86]), 0.72 (95% CI [0.63-0.81]), and 0.72 (95% CI [0.62-0.82]) on the internally prepared, externally prepared, and temporal validation sets, respectively. While MSI-H status is significantly correlated with Gleason score, the model remained predictive within each Gleason score subgroup. In summary, we developed and validated an AI-based MSI-H diagnostic model on a large real-world cohort of routine H&E slides, which effectively generalized to externally stained and scanned samples and a temporally independent validation cohort. This algorithm has the potential to direct prostate cancer patients toward immunotherapy and to identify MSI-H cases secondary to Lynch syndrome.