Identifying the Most Explainable Classifier
This work addresses the need for formal measures of explainability in machine learning, although its contribution appears incremental because it builds on existing notions of explanation.
The paper tackles the problem of characterizing the most explainable classifier by introducing pointwise coverage to measure explainability, and proves that the binary linear classifier is uniquely the most explainable up to negligible sets.
We introduce the notion of pointwise coverage to measure the explainability properties of machine learning classifiers. An explanation for a prediction is a definably simple region of the feature space sharing the same label as the prediction, and the coverage of an explanation measures its size or generalizability. With this notion of explanation, we investigate whether there is a natural characterization of the most explainable classifier. In accordance with our intuitions, we prove that the binary linear classifier is uniquely the most explainable classifier up to negligible sets.
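
The notions of explanation and coverage can be illustrated with a small numerical sketch. The code below is not the paper's formal construction: it assumes a uniform distribution on the unit square, treats a "region" as an arbitrary membership predicate, and estimates coverage as the fraction of sampled points falling inside a region whose sampled points all share the predicted label. The names `make_linear_classifier` and `estimate_coverage` are hypothetical and introduced only for illustration.

```python
import random

def make_linear_classifier(w, b):
    """Binary linear classifier: label 1 if w.x + b >= 0, else 0."""
    def predict(x):
        return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else 0
    return predict

def estimate_coverage(region, predict, label, samples):
    """Monte Carlo stand-in for coverage (an assumption, not the paper's definition).

    Returns the fraction of samples inside `region`, or 0.0 if any sampled
    point inside the region has a label different from `label`, since the
    region then fails to be an explanation on this sample.
    """
    inside = [x for x in samples if region(x)]
    if any(predict(x) != label for x in inside):
        return 0.0
    return len(inside) / len(samples)

# Assumed setup: uniform samples on [0, 1]^2 with a fixed seed.
rng = random.Random(0)
samples = [(rng.random(), rng.random()) for _ in range(10_000)]

# Classifier that splits the square at x0 = 0.5.
predict = make_linear_classifier(w=(1.0, 0.0), b=-0.5)
point = (0.9, 0.5)
label = predict(point)

# Candidate explanations for the prediction at `point`.
half_space = lambda x: x[0] >= 0.5                             # the whole positive side
small_box = lambda x: 0.8 <= x[0] <= 1.0 and 0.4 <= x[1] <= 0.6
too_big = lambda x: x[0] >= 0.3                                # crosses the decision boundary

cov_half = estimate_coverage(half_space, predict, label, samples)
cov_box = estimate_coverage(small_box, predict, label, samples)
cov_bad = estimate_coverage(too_big, predict, label, samples)
```

In this toy setting the half-space is a valid explanation covering roughly half of the sampled points, the small box is valid but covers far less, and the region that crosses the decision boundary is rejected. This is meant only to convey why a linear decision boundary admits simple explanations with large coverage.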