Ensemble Interpretation: A Unified Method for Interpretable Machine Learning
This work addresses stability and fidelity problems in interpretable machine learning, offering an incremental improvement by combining existing interpretation methods.
The paper tackles the problem of stability and fidelity in interpretable machine learning by proposing ensemble interpretation, a method that integrates multiple explanation perspectives, resulting in more stable explanations and improved generalization performance in feature selection applications.
To address the issues of stability and fidelity in interpretable machine learning, this paper presents ensemble interpretation, a novel methodology that integrates the multi-perspective explanations produced by various interpretation methods. On the one hand, we define a unified paradigm that describes the common mechanism of different interpretation methods, and then integrate their results to obtain a more stable explanation. On the other hand, we propose a supervised evaluation method, based on prior knowledge, to assess the explanatory performance of an interpretation method. The experimental results show that ensemble interpretation is more stable and more consistent with human experience and cognition. As an application, we use ensemble interpretation for feature selection, which significantly improves the generalization performance of the corresponding learning model.
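The abstract does not specify how the individual interpretation results are combined, so the following is only a minimal sketch of the general idea, not the paper's procedure. It assumes each interpretation method yields one attribution score per feature, that scores are made comparable by normalizing their absolute values to sum to 1, and that the ensemble is a plain average. The function names (`normalize`, `ensemble_interpretation`, `select_top_k`) and the attribution vectors are hypothetical.

```python
import numpy as np


def normalize(attributions):
    """Scale absolute attribution scores to sum to 1 (an assumed normalization)."""
    scores = np.abs(np.asarray(attributions, dtype=float))
    total = scores.sum()
    # An all-zero vector carries no ranking information; fall back to uniform.
    if total == 0:
        return np.full(scores.shape, 1.0 / scores.size)
    return scores / total


def ensemble_interpretation(attribution_list):
    """Average the normalized attributions of several interpretation methods."""
    stacked = np.stack([normalize(a) for a in attribution_list])
    return stacked.mean(axis=0)


def select_top_k(importance, k):
    """Return the indices of the k most important features, best first."""
    order = np.argsort(-importance, kind="stable")
    return order[:k].tolist()


# Hypothetical attributions for 4 features from three interpretation methods.
method_a = [0.5, -0.1, 0.3, 0.1]
method_b = [2.0, 0.0, 1.0, 1.0]
method_c = [0.4, 0.2, 0.4, 0.0]

importance = ensemble_interpretation([method_a, method_b, method_c])
top2 = select_top_k(importance, 2)
print(importance, top2)
```

Averaging after normalization keeps any single method's scale from dominating the result, which is one simple way to damp the instability of an individual explanation. The selected indices could then be used to restrict the training features of a downstream model, as in the feature selection application described above.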