Learning Interpretable Point-Based Clinical Risk Scores via Direct Optimization

Ying Cui, Albert M Li, Vivek Charu, Yeon-Mi Hwang, Tina Hernandez-Boussard, Lu Tian

arXiv:2605.1911322.3

Predicted impact: top 47% in ME · last 90 days
Originality: Incremental advance

AI Analysis

For clinical practitioners, this provides a computationally efficient way to generate interpretable, integer-weighted risk scores that are optimized directly under explicit objectives, filling the gap between fast but possibly suboptimal rounding-based methods and computationally heavy integer programming.

The paper develops greedy optimization algorithms that learn integer-weighted clinical risk scores directly, avoiding the potentially suboptimal step of rounding regression coefficients. Applied to a large EHR cohort, the method constructs a comorbidity score for post-discharge mortality risk, and a simulation study examines its finite-sample operating characteristics.

Many clinical risk scores are deployed as additive rules with nonnegative integer points assigned to relevant binary predictive features. These integer weights not only make the score easier to use in practice but also promote sparsity in the resulting prediction model. Such risk scores are often derived by first fitting a regression model and then rounding the estimated coefficients to the nearest integer after appropriate scaling. This approach is computationally fast but does not guarantee optimality of the resulting score. Alternatively, one may search over all possible integer weights to directly optimize a value function by posing the problem as an integer programming task. However, the associated computational burden can be substantial, especially when the value function is nonconcave or even discontinuous. In this paper, we develop new machine learning algorithms that employ a flexible greedy optimization strategy to learn such additive scoring rules directly under explicit and sensible optimality objectives. We apply the proposed method to a large electronic health record (EHR) cohort in Epic Cosmos to construct an integer-weighted comorbidity score for measuring the risk of post-discharge mortality. We also conduct a simulation study to examine the finite-sample operating characteristics.
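To make the idea of learning integer points by greedy search concrete, here is a minimal sketch. It is not the authors' algorithm, which the abstract does not specify. It assumes binary features, AUC as the value function, and a simple coordinate-wise search that changes one feature's points by ±1 at a time. The names `auc`, `greedy_integer_score`, `max_points`, and `max_sweeps` are hypothetical and chosen for illustration only.

```python
import numpy as np


def auc(scores, y):
    """Empirical AUC (Mann-Whitney form); tied pairs count as 1/2."""
    pos = scores[y == 1]
    neg = scores[y == 0]
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (len(pos) * len(neg)))


def greedy_integer_score(X, y, max_points=5, max_sweeps=20):
    """Illustrative greedy coordinate search over nonnegative integer weights.

    Assumption: the value function is AUC. Any other objective, including a
    discontinuous one, could be substituted for `auc`.
    """
    X = np.asarray(X)
    y = np.asarray(y)
    w = np.zeros(X.shape[1], dtype=int)
    best = auc(X @ w, y)  # all-zero score gives AUC 0.5
    for _ in range(max_sweeps):
        improved = False
        for j in range(X.shape[1]):
            for step in (1, -1):
                cand = w.copy()
                cand[j] += step
                # Keep points nonnegative and bounded.
                if cand[j] < 0 or cand[j] > max_points:
                    continue
                val = auc(X @ cand, y)
                # Accept a move only if it strictly improves the objective.
                if val > best + 1e-12:
                    w, best, improved = cand, val, True
        if not improved:
            break  # no single-point change helps: a local optimum
    return w, best
```

Because the search evaluates the objective directly on integer candidates, it needs no gradient and no rounding step. Like any greedy method, it may stop at a local optimum rather than the global one that an exhaustive integer program would find. Features that never improve the objective keep zero points, which is how sparsity can arise.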
