ML · OC · ME · Oct 1, 2016

Learning Optimized Risk Scores

arXiv:1610.00168v5 · 107 citations
Originality: Highly original
AI Analysis

This paper addresses the need for calibrated, sparse, and constraint-obeying risk scores in domains like medicine and criminal justice, and offers a more efficient and reliable method than heuristic approaches.

The paper formulates the problem of learning risk scores from data as a mixed integer nonlinear program and solves it with a cutting plane algorithm. The resulting method scales linearly in the number of samples and returns a certificate of optimality without parameter tuning.
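To make the cutting plane idea concrete, here is a minimal sketch in Python. It is not the paper's algorithm: it is a textbook Kelley-style loop, and it replaces the mixed integer solver with brute-force enumeration over a tiny coefficient lattice. The objective (mean logistic loss plus a penalty `c0` per non-zero coefficient), the coefficient bound, the function names, and the toy data are all assumptions made for illustration.

```python
import itertools
import math

# Toy data, invented for illustration: rows are (intercept, x1, x2), labels are +1/-1.
X_TOY = [(1, 1, 0), (1, 1, 1), (1, 0, 1), (1, 0, 0), (1, 1, 0), (1, 0, 1)]
Y_TOY = [1, 1, -1, -1, -1, 1]


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def logistic_loss(lam, X, y):
    """Mean logistic loss of the linear score <lam, x>."""
    return sum(math.log1p(math.exp(-yi * dot(lam, xi))) for xi, yi in zip(X, y)) / len(X)


def loss_gradient(lam, X, y):
    """Gradient of the mean logistic loss with respect to lam."""
    grad = [0.0] * len(lam)
    for xi, yi in zip(X, y):
        weight = -yi / (1.0 + math.exp(yi * dot(lam, xi)))
        for j, xij in enumerate(xi):
            grad[j] += weight * xij / len(X)
    return grad


def sparsity_penalty(lam, c0):
    # Assumption: every coefficient is penalized, including the intercept.
    return c0 * sum(1 for v in lam if v != 0)


def objective(lam, X, y, c0):
    return logistic_loss(lam, X, y) + sparsity_penalty(lam, c0)


def cutting_plane(X, y, bound=2, c0=0.01, tol=1e-9, max_iter=1000):
    """Kelley-style loop over integer coefficients in [-bound, bound]."""
    d = len(X[0])
    lattice = list(itertools.product(range(-bound, bound + 1), repeat=d))
    cuts = []                      # each cut: (loss value, gradient, point)
    lam = (0,) * d
    best, upper, lower = lam, float("inf"), 0.0
    for _ in range(max_iter):
        # Add a linear under-estimator of the convex loss at the current point.
        cuts.append((logistic_loss(lam, X, y), loss_gradient(lam, X, y), lam))
        current = objective(lam, X, y, c0)
        if current < upper:
            best, upper = lam, current

        def surrogate(cand):
            approx = max(
                v + dot(g, [c - p for c, p in zip(cand, point)])
                for v, g, point in cuts
            )
            return approx + sparsity_penalty(cand, c0)

        # Stand-in for the MIP solve: enumerate the whole (tiny) lattice.
        lam = min(lattice, key=surrogate)
        lower = surrogate(lam)     # valid lower bound on the true optimum
        if upper - lower <= tol:   # closed gap certifies optimality on the lattice
            break
    return best, upper, lower
```

Calling `cutting_plane(X_TOY, Y_TOY)` returns the best integer coefficient vector found, together with upper and lower bounds on the objective. When the two bounds meet (up to `tol`), the solution is provably optimal over the enumerated lattice, which is the sense of "certificate of optimality" this sketch illustrates.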

Risk scores are simple classification models that let users make quick risk predictions by adding and subtracting a few small numbers. These models are widely used in medicine and criminal justice, but are difficult to learn from data because they need to be calibrated, sparse, use small integer coefficients, and obey application-specific operational constraints. In this paper, we present a new machine learning approach to learn risk scores. We formulate the risk score problem as a mixed integer nonlinear program, and present a cutting plane algorithm for non-convex settings to efficiently recover its optimal solution. We improve our algorithm with specialized techniques to generate feasible solutions, narrow the optimality gap, and reduce data-related computation. Our approach can fit risk scores in a way that scales linearly in the number of samples, provides a certificate of optimality, and obeys real-world constraints without parameter tuning or post-processing. We benchmark the performance benefits of this approach through an extensive set of numerical experiments, comparing to risk scores built using heuristic approaches. We also discuss its practical benefits through a real-world application where we build a customized risk score for ICU seizure prediction in collaboration with the Massachusetts General Hospital.
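As a rough illustration of how a risk score is used at prediction time, the sketch below adds up integer points for the features that are present and converts the total to a probability. The feature names, point values, and intercept are hypothetical, and the logistic link is an assumption of this sketch rather than something the abstract states.

```python
import math

# Hypothetical point values, invented for illustration (not taken from the paper).
POINTS = {"feature_a": 2, "feature_b": 1, "feature_c": -1}
INTERCEPT = -3  # hypothetical offset applied before the logistic link


def total_score(record):
    """Add the points of every feature that is present (truthy) in the record."""
    return sum(points for name, points in POINTS.items() if record.get(name, 0))


def predicted_risk(score, intercept=INTERCEPT):
    """Map an integer score to a probability with a logistic link (assumption)."""
    return 1.0 / (1.0 + math.exp(-(intercept + score)))
```

For example, a record with `feature_a` and `feature_c` present gets a total of 2 - 1 = 1 point, and `predicted_risk` is monotone in the score, so a higher total always means a higher predicted risk.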

Code Implementations: 2 repos
