CXPlain: Causal Explanations for Model Interpretation under Uncertainty
This addresses the problem of interpreting and validating machine-learning models for users by offering improved feature importance estimates, though it is incremental as it builds on existing explanation methods.
The authors tackled the challenge of providing fast, accurate feature importance estimates with uncertainty quantification for machine-learning models by framing explanation as a causal learning task, resulting in CXPlain, which demonstrated significantly higher accuracy and speed than existing model-agnostic methods in experiments.
Feature importance estimates that inform users about the degree to which given inputs influence the output of a predictive model are crucial for understanding, validating, and interpreting machine-learning models. However, providing fast and accurate estimates of feature importance for high-dimensional data, and quantifying the uncertainty of such estimates remain open challenges. Here, we frame the task of providing explanations for the decisions of machine-learning models as a causal learning task, and train causal explanation (CXPlain) models that learn to estimate to what degree certain inputs cause outputs in another machine-learning model. CXPlain can, once trained, be used to explain the target model in little time, and enables the quantification of the uncertainty associated with its feature importance estimates via bootstrap ensembling. We present experiments that demonstrate that CXPlain is significantly more accurate and faster than existing model-agnostic methods for estimating feature importance. In addition, we confirm that the uncertainty estimates provided by CXPlain ensembles are strongly correlated with their ability to accurately estimate feature importance on held-out data.