LGMay 4

Gradient Boosted Risk Scores

arXiv:2605.025937.6

AI Analysis

This work provides a practical tool for generating interpretable risk scores in high-stakes domains like medicine, offering a balance between predictive performance and model simplicity.

The authors propose a gradient boosting-based method for building compact and interpretable risk scores that can model nonlinear effects. Their approach achieves competitive predictive performance while producing 60% fewer rules for classification and 16% fewer rules for time-to-event tasks compared to AutoScore.

Risk scores are an interpretable and actionable class of machine learning models with applications in medicine, insurance, and risk management. Unlike most computational methods, risk scores are designed to be computed by a human by attributing points to a data sample based on a limited set of criteria. The most common approaches for generating risk scores use linear regressions to estimate the effect of selected variables. We propose a simple and effective approach towards building compact and predictive risk scores. We provide an algorithm based on gradient boosting that is capable of modeling nonlinear effects, along with a C++ implementation with Python and R bindings. Through extensive empirical evaluation on twelve tabular datasets spanning regression, classification, and time-to-event tasks, we show that our method achieves competitive predictive performance while producing substantially more compact scores than regression-based alternatives, with 60% fewer rules for classification tasks and 16% fewer rules for time-to-event tasks on average, compared to AutoScore.

View on arXiv PDF

Similar