Global Model Interpretation via Recursive Partitioning
This work addresses the need for human-understandable interpretations of complex machine learning models, which is crucial for trust and debugging in applications, though it appears incremental as it builds on existing interpretation techniques.
The authors tackled the problem of interpreting black-box machine learning models globally by proposing a method that uses a compact binary tree to represent key decision rules, demonstrating its effectiveness in diagnosing models and aiding knowledge discovery across multiple tasks.
In this work, we propose a simple but effective method to interpret black-box machine learning models globally. That is, we use a compact binary tree, the interpretation tree, to explicitly represent the most important decision rules that are implicitly contained in the black-box machine learning models. This tree is learned from the contribution matrix which consists of the contributions of input variables to predicted scores for each single prediction. To generate the interpretation tree, a unified process recursively partitions the input variable space by maximizing the difference in the average contribution of the split variable between the divided spaces. We demonstrate the effectiveness of our method in diagnosing machine learning models on multiple tasks. Also, it is useful for new knowledge discovery as such insights are not easily identifiable when only looking at single predictions. In general, our work makes it easier and more efficient for human beings to understand machine learning models.