LG, ML | Jul 8, 2019

The Price of Interpretability

Dimitris Bertsimas, Arthur Delarue, Patrick Jaillet, Sebastien Martin

arXiv:1907.03419v1

Originality: Incremental advance

AI Analysis

This work addresses the need for formal interpretability measures in decision-making applications, though it is incremental in extending existing proxies.

The paper tackles the loosely defined concept of interpretability in machine learning by introducing a mathematical framework to construct models through interpretable steps, quantifying the tradeoff between interpretability and predictive accuracy as the 'price' of interpretability.

When quantitative models are used to support decision-making on complex and important topics, understanding a model's "reasoning" can increase trust in its predictions, expose hidden biases, or reduce vulnerability to adversarial attacks. However, the concept of interpretability remains loosely defined and application-specific. In this paper, we introduce a mathematical framework in which machine learning models are constructed in a sequence of interpretable steps. We show that for a variety of models, a natural choice of interpretable steps recovers standard interpretability proxies (e.g., sparsity in linear models). We then generalize these proxies to yield a parametrized family of consistent measures of model interpretability. This formal definition allows us to quantify the "price" of interpretability, i.e., the tradeoff with predictive accuracy. We demonstrate practical algorithms to apply our framework on real and synthetic datasets.
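As a rough illustration of the sparsity proxy mentioned in the abstract, the sketch below builds a linear model one feature at a time and reports, at each step, how much training error is given up relative to the full model. This is not the paper's algorithm: greedy forward selection, the synthetic data, and the helper names `fit_mse` and `greedy_path` are all assumptions made here for illustration.

```python
import numpy as np

def fit_mse(X, y, support):
    """Training MSE of a least-squares fit restricted to the columns in `support`."""
    if not support:
        return float(np.mean(y ** 2))  # the empty model predicts 0 everywhere
    coef, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
    resid = y - X[:, support] @ coef
    return float(np.mean(resid ** 2))

def greedy_path(X, y):
    """Add one feature per step (hypothetical 'interpretable step').

    Returns a list of (support, mse) pairs, one per sparsity level.
    """
    support = []
    remaining = list(range(X.shape[1]))
    path = [([], fit_mse(X, y, []))]
    while remaining:
        # Pick the feature whose addition gives the lowest training MSE.
        best = min(remaining, key=lambda j: fit_mse(X, y, support + [j]))
        support = support + [best]
        remaining.remove(best)
        path.append((list(support), fit_mse(X, y, support)))
    return path

# Synthetic data (assumed): only features 0 and 2 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + 0.1 * rng.normal(size=200)

path = greedy_path(X, y)
full_mse = path[-1][1]
for support, mse in path:
    # "Price" here is the extra training error paid for using fewer features.
    print(len(support), support, round(mse - full_mse, 4))
```

In this toy setting the gap to the full model shrinks sharply once both informative features are included, which is the kind of accuracy-versus-simplicity tradeoff the abstract refers to as the "price" of interpretability.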
