FasterRisk: Fast and Accurate Interpretable Risk Scores
This addresses the need for fast and accurate predictive models in domains like healthcare and criminal justice, offering an incremental improvement over existing computational methods.
The authors tackled the problem of creating high-quality, interpretable risk scores from data by introducing FasterRisk, a method that efficiently generates a collection of sparse linear models with integer coefficients, completing within minutes and producing scores with low logistic loss.
Over the last century, risk scores have been the most popular form of predictive model used in healthcare and criminal justice. Risk scores are sparse linear models with integer coefficients; often these models can be memorized or placed on an index card. Typically, risk scores have been created either without data or by rounding logistic regression coefficients, but these methods do not reliably produce high-quality risk scores. Recent work used mathematical programming, which is computationally slow. We introduce an approach for efficiently producing a collection of high-quality risk scores learned from data. Specifically, our approach produces a pool of almost-optimal sparse continuous solutions, each with a different support set, using a beam-search algorithm. Each of these continuous solutions is transformed into a separate risk score through a "star ray" search, where a range of multipliers are considered before rounding the coefficients sequentially to maintain low logistic loss. Our algorithm returns all of these high-quality risk scores for the user to consider. This method completes within minutes and can be valuable in a broad variety of applications.