Methods and Models for Interpretable Linear Classification
This work addresses practitioners' need to build tailored, interpretable models in domains such as healthcare. Its contribution is incremental in that it builds on existing integer programming and interpretability methods.
The authors address the problem of building accurate and interpretable linear classification models by introducing an integer programming framework that minimizes the 0-1 classification loss and incorporates discrete constraints. They show that discrete linear classifiers can attain the training accuracy of other linear classifiers, and they demonstrate the framework's utility in a clinical case study on sleep apnea diagnosis.
We present an integer programming framework to build accurate and interpretable discrete linear classification models. Unlike existing approaches, our framework is designed to provide practitioners with the control and flexibility they need to tailor accurate and interpretable models for a domain of choice. To this end, our framework can produce models that are fully optimized for accuracy, by minimizing the 0--1 classification loss, and that address multiple aspects of interpretability, by incorporating a range of discrete constraints and penalty functions. We use our framework to produce models that are difficult to create with existing methods, such as scoring systems and M-of-N rule tables. In addition, we propose specially designed optimization methods to improve the scalability of our framework through decomposition and data reduction. We show that discrete linear classifiers can attain the training accuracy of any other linear classifier, and provide an Occam's Razor type argument as to why the use of small discrete coefficients can provide better generalization. We demonstrate the performance and flexibility of our framework through numerical experiments and a case study in which we construct a highly tailored clinical tool for sleep apnea diagnosis.
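To make the idea of a discrete linear classifier concrete, the following is a minimal sketch of a scoring system fit by minimizing the 0-1 loss plus a penalty on the number of nonzero coefficients. It is not the integer programming formulation described above: it swaps in brute-force enumeration over a small coefficient set, which is only feasible for a handful of features. The function names, the coefficient range {-2, ..., 2}, the penalty weight, and the toy data are all assumptions made for illustration.

```python
from itertools import product


def zero_one_loss(coefs, intercept, X, y):
    """Count misclassified points. Labels are +1 / -1; a score of 0 counts as an error."""
    errors = 0
    for features, label in zip(X, y):
        score = intercept + sum(c * f for c, f in zip(coefs, features))
        if label * score <= 0:
            errors += 1
    return errors


def fit_scoring_system(X, y, coef_values=range(-2, 3), penalty=0.1):
    """Enumerate every integer coefficient vector and keep the best one.

    Objective (an assumption for this sketch): error rate on the training
    data plus `penalty` times the number of nonzero feature coefficients.
    The intercept is not penalized.
    """
    n_features = len(X[0])
    best = None
    for candidate in product(coef_values, repeat=n_features + 1):
        intercept, coefs = candidate[0], candidate[1:]
        errors = zero_one_loss(coefs, intercept, X, y)
        n_nonzero = sum(1 for c in coefs if c != 0)
        objective = errors / len(X) + penalty * n_nonzero
        if best is None or objective < best[0]:
            best = (objective, intercept, coefs, errors)
    return best


# Hypothetical toy data: three binary features, label determined by the first one.
X = [(1, 0, 1), (1, 1, 0), (0, 0, 1), (0, 1, 0), (1, 1, 1), (0, 0, 0)]
y = [1, 1, -1, -1, 1, -1]

objective, intercept, coefs, errors = fit_scoring_system(X, y)
print("intercept:", intercept, "coefficients:", coefs, "training errors:", errors)
```

On this toy data the search selects the score 2*x0 - 1, which uses a single feature and makes no training errors. The enumeration grows exponentially with the number of features, which is why a practical approach would rely on an integer programming solver together with decomposition and data reduction techniques of the kind the abstract mentions.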