Leveraging Black-box Models to Assess Feature Importance in Unconditional Distribution
This work addresses the need for interpretable feature importance analysis of the unconditional outcome distribution in applications that rely on black-box models, and represents an incremental improvement.
The authors develop an approximation method that uses pre-trained black-box models to assess feature importance for the unconditional outcome distribution; the method produces sparse, faithful results and is computationally efficient.
Understanding how changes in explanatory features affect the unconditional distribution of the outcome is important in many applications. However, existing black-box predictive models are not readily suited to answering such questions. In this work, we develop an approximation method that computes feature importance curves for the unconditional distribution of outcomes while leveraging the power of pre-trained black-box predictive models. The feature importance curves measure the changes across quantiles of the outcome distribution in response to an external change in the explanatory features. Through extensive numerical experiments and real data examples, we demonstrate that our approximation method produces sparse and faithful results and is computationally efficient.
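To make the notion of a feature importance curve concrete, the following is a minimal sketch of a naive finite-difference baseline. It is not the approximation method developed in this work. It assumes that shifting one feature by a small amount leaves each observation's residual unchanged, which is a strong simplification. The names `quantile_shift_curve`, `toy_predict`, `delta`, and `taus` are hypothetical and chosen only for illustration; `predict` stands in for any pre-trained black-box model.

```python
import numpy as np


def quantile_shift_curve(predict, X, y, feature, delta=0.1, taus=None):
    """Naive estimate of how unconditional quantiles of y move per unit
    shift in one feature. Illustrative baseline only, not the paper's method.

    predict : callable mapping an (n, d) array to n predictions (the black box)
    feature : column index of the feature to shift
    delta   : size of the location shift applied to that feature
    taus    : quantile levels at which the curve is evaluated
    """
    if taus is None:
        taus = np.linspace(0.1, 0.9, 9)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Assumption: residuals are unaffected by the shift in the feature.
    residuals = y - predict(X)

    # Shift the chosen feature for every observation.
    X_shifted = X.copy()
    X_shifted[:, feature] += delta

    # Counterfactual outcomes under the shifted feature distribution.
    y_counterfactual = predict(X_shifted) + residuals

    # Change in each unconditional quantile, per unit of shift.
    curve = (np.quantile(y_counterfactual, taus) - np.quantile(y, taus)) / delta
    return taus, curve


# Toy data: the outcome depends on feature 0 only.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
y = 2.0 * X[:, 0] + rng.normal(scale=0.5, size=500)


def toy_predict(X):
    # Stand-in for a pre-trained black-box model.
    return 2.0 * X[:, 0]


taus, curve_feature0 = quantile_shift_curve(toy_predict, X, y, feature=0)
_, curve_feature1 = quantile_shift_curve(toy_predict, X, y, feature=1)
```

For this linear toy model the curve for feature 0 is flat at about 2 across all quantile levels, and the curve for feature 1 is about 0. With a nonlinear black box the curve would generally vary across quantiles, which is the kind of behavior the feature importance curves are meant to expose.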