ML | IT | LG | Jun 25, 2018

Why Interpretability in Machine Learning? An Answer Using Distributed Detection and Data Fusion Theory

arXiv:1806.09710v1 | 7 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses the problem of balancing interpretability and accuracy for users of ML systems. Its theoretical framework is an incremental contribution: it applies existing distributed detection theory to this domain.

The paper tackles the trade-off between interpretability and accuracy in machine learning by modeling the decision-making system as a human-machine tandem, proving that interpretable classifiers can outperform black-box ones in overall system performance.

As artificial intelligence is increasingly affecting all parts of society and life, there is growing recognition that human interpretability of machine learning models is important. It is often argued that accuracy or other similar generalization performance metrics must be sacrificed in order to gain interpretability. Such arguments, however, fail to acknowledge that the overall decision-making system is composed of two entities: the learned model and a human who fuses together model outputs with his or her own information. As such, the relevant performance criteria should be for the entire system, not just for the machine learning component. In this work, we characterize the performance of such two-node tandem data fusion systems using the theory of distributed detection. In doing so, we work in the population setting and model interpretable learned models as multi-level quantizers. We prove that under our abstraction, the overall system of a human with an interpretable classifier outperforms one with a black box classifier.
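The tandem argument can be illustrated with a small numerical sketch. This is a toy setup, not the paper's construction, and every modeling choice is an assumption made for illustration: two equally likely hypotheses, a machine and a human who each see one independent Gaussian observation with mean -1 or +1 and unit variance, a "black box" that forwards only a one-bit decision (a single threshold), and an "interpretable" model that forwards a four-level quantization of the same observation. The human fuses the received message with their own observation using the Bayes-optimal likelihood ratio test. The function name `tandem_error` and the thresholds are hypothetical.

```python
import math


def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tandem_error(thresholds):
    """Bayes error of a human who fuses a quantized machine message
    with their own observation.

    Assumed toy model (not taken from the paper):
      H0: observations ~ N(-1, 1), H1: observations ~ N(+1, 1), equal priors;
      machine and human observations are conditionally independent;
      the machine sends the index of the quantizer cell containing its sample.
    """
    edges = [-math.inf] + sorted(thresholds) + [math.inf]
    err = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Probability that the machine sends this message under each hypothesis.
        p0 = phi(hi + 1.0) - phi(lo + 1.0)
        p1 = phi(hi - 1.0) - phi(lo - 1.0)
        # The human's log-likelihood ratio is 2 * x_h, so the optimal rule is
        # "decide H1 if x_h > t", with t shifted by the message's evidence.
        t = -0.5 * math.log(p1 / p0)
        false_alarm = p0 * (1.0 - phi(t + 1.0))  # decide H1 when H0 is true
        miss = p1 * phi(t - 1.0)                 # decide H0 when H1 is true
        err += 0.5 * (false_alarm + miss)
    return err


human_alone = 1.0 - phi(1.0)                      # human ignores the machine
black_box = tandem_error([0.0])                   # one-bit message
interpretable = tandem_error([-1.0, 0.0, 1.0])    # four-level message

print(f"human alone:   {human_alone:.4f}")
print(f"black box:     {black_box:.4f}")
print(f"interpretable: {interpretable:.4f}")
```

Under these assumptions the four-level message refines the one-bit message, so the human's fused error can only go down. This mirrors the system-level view in the abstract, although the paper's result is proved for its own abstraction rather than for this Gaussian example.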

