Integrating White and Black Box Techniques for Interpretable Machine Learning
This addresses the problem of balancing accuracy and transparency for users of ML systems, but it appears incremental as it combines existing techniques.
The paper tackled the trade-off between interpretability and performance in machine learning by proposing an ensemble classifier that uses a white box model for easier inputs and a black box model for harder ones, but no concrete results or numbers were reported.
In machine learning algorithm design, there exists a trade-off between the interpretability and performance of the algorithm. In general, algorithms which are simpler and easier for humans to comprehend tend to show worse performance than more complex, less transparent algorithms. For example, a random forest classifier is likely to be more accurate than a simple decision tree, but at the expense of interpretability. In this paper, we present an ensemble classifier design which classifies easier inputs using a highly-interpretable classifier (i.e., white box model), and more difficult inputs using a more powerful, but less interpretable classifier (i.e., black box model).