Georgios Antonios Margonis

3.3 AP Nov 3, 2023

Dimitris Bertsimas, Angelos G. Koulouras, Georgios Antonios Margonis

We propose a prognostic stratum matching framework that addresses the deficiencies of Randomized trial data subgroup analysis and transforms ObservAtional Data to be used as if they were randomized, thus paving the road for precision medicine. Our approach counters the effects of unobserved confounding in observational data by correcting the estimated probabilities of the outcome under a treatment through a novel two-step process. These probabilities are then used to train Optimal Policy Trees (OPTs), which are decision trees that optimally assign treatments to subgroups of patients based on their characteristics. This facilitates the creation of clinically intuitive treatment recommendations. We applied our framework to observational data of patients with gastrointestinal stromal tumors (GIST) and validated the OPTs in an external cohort using sensitivity and specificity. We show that these recommendations outperformed those of GIST experts. We further applied the same framework to randomized clinical trial (RCT) data of patients with extremity sarcomas. Although the initial trial results suggested that all patients should receive treatment, our framework, after addressing imbalances in patient distribution due to the trial's small sample size, identified through the OPTs a subset of patients with unique characteristics who may not require treatment. Again, we successfully validated our recommendations in an external cohort.

6.8 ME Apr 16

Robustifying and Selecting Cohort-Appropriate Prognostic Models under Distributional Shifts

Dimitris Bertsimas, Carol Gao, Angelos G. Koulouras et al.

External validation is widely regarded as the gold standard for prognostic model evaluation. In this study, we challenge the assumption that successful external calibration guarantees model generalizability and propose two complementary strategies to improve transportability of prognostic models across cohorts. Using six real-world surgical cohorts from tertiary academic centers, we tested whether successful external calibration depends largely on similarity in covariates and outcomes between training and validation cohorts, quantified using Kullback-Leibler (KL) divergence, with calibration assessed by the Integrated Calibration Index (ICI). From the model developer's perspective, we trained the "best-on-average" prognostic model by tuning toward a meta-analysis-derived covariate and outcome distribution as an approximation of the broader target population. From the end user's perspective, we proposed a simple measure for cohort outcome similarity to identify, among published models, the one most suitable for a given target cohort in terms of both calibration and clinical utility. External calibration worsened as distributional mismatch increased. Higher KL divergence was associated with higher ICI in both surgery-alone (Spearman $\rho=0.614$, $p=0.004$) and surgery + adjuvant chemotherapy cohorts (Spearman $\rho=0.738$, $p<0.001$). Meta-analysis-informed weighting improved calibration in most settings without materially affecting discrimination, with the clearest benefit when evaluated on the aggregated external population ($p=0.037$). Models developed in more similar cohorts achieved lower ICI in surgery-alone (Spearman $\rho=0.803$, $p<0.001$) and surgery + adjuvant chemotherapy cohorts (Spearman $\rho=0.737$, $p<0.001$), and provided greater clinical utility on decision curve analysis (DCA).
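The kind of analysis this abstract describes, relating distributional mismatch (KL divergence) to calibration error (ICI) through a Spearman rank correlation, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not say how the KL divergence was estimated on real covariates, so the sketch uses discrete distributions over a single hypothetical categorical covariate, and the function names (`kl_divergence`, `average_ranks`, `spearman_rho`) and all numbers are invented toy values, not the study's data.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for two discrete distributions over the same categories.

    Assumption: p and q are already normalized probability vectors.
    eps guards against division by zero when q has an empty category.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of categories")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:  # terms with p_i = 0 contribute nothing
            total += pi * math.log(pi / max(qi, eps))
    return total


def average_ranks(values):
    """Ranks starting at 1, with ties given the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical training-cohort distribution of one categorical covariate.
train = [0.5, 0.3, 0.2]
# Hypothetical validation cohorts, increasingly different from the training cohort.
validation_cohorts = [
    [0.5, 0.3, 0.2],
    [0.4, 0.35, 0.25],
    [0.3, 0.3, 0.4],
    [0.1, 0.2, 0.7],
]
# Toy ICI values for the same cohorts (invented, for illustration only).
toy_ici = [0.02, 0.03, 0.05, 0.11]

divergences = [kl_divergence(v, train) for v in validation_cohorts]
rho = spearman_rho(divergences, toy_ici)
print(divergences, rho)
```

With real cohorts the divergence would need to cover several covariates and the outcome jointly, which requires density estimation choices that this sketch does not attempt.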


2 Papers