Variable Importance Clouds: A Way to Explore Variable Importance for the Set of Good Models
This work is relevant to researchers and practitioners in fields such as the social sciences, healthcare, and causal inference who rely on variable importance for model interpretation. The contribution is incremental, since it builds on existing variable importance concepts.
The paper tackles the problem that variable importance is often tied to a single predictive model, which can be misleading when multiple models perform equally well, by introducing variable importance clouds to explore importance across all approximately-equally-accurate models. It demonstrates through experiments on criminal justice, marketing, and image classification data that variable importance can vary significantly among these models.
Variable importance is central to scientific studies, including the social sciences and causal inference, healthcare, and other domains. However, current notions of variable importance are often tied to a specific predictive model. This is problematic: what if there were multiple well-performing predictive models, and a specific variable is important to some of them and not to others? In that case, we may not be able to tell from a single well-performing model whether a variable is always important in predicting the outcome. Rather than depending on variable importance for a single predictive model, we would like to explore variable importance for all approximately-equally-accurate predictive models. This work introduces the concept of a variable importance cloud, which maps every variable to its importance for every good predictive model. We show properties of the variable importance cloud and draw connections to other areas of statistics. We introduce variable importance diagrams as a projection of the variable importance cloud into two dimensions for visualization purposes. Experiments with criminal justice, marketing data, and image classification tasks illustrate how variables can change dramatically in importance for approximately-equally-accurate predictive models.
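As a rough illustration of the idea, the sketch below builds an approximate variable importance cloud for a toy linear regression with two correlated features. Everything in it is an assumption made for illustration and is not taken from the paper: the synthetic data, the tolerance `epsilon`, the random sampling of candidate coefficient vectors as a stand-in for the full set of good models, and the permutation-based loss increase used as the importance measure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumed): two strongly correlated features, so many
# different coefficient vectors predict almost equally well.
n = 500
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + np.sqrt(1 - 0.9**2) * rng.normal(size=n)
X = np.column_stack([x1, x2])
y = x1 + x2 + rng.normal(scale=1.0, size=n)


def mse(beta, X, y):
    """Mean squared error of the linear model y ~ X @ beta."""
    return float(np.mean((y - X @ beta) ** 2))


def permutation_importance(beta, X, y, rng):
    """Loss increase when each feature column is shuffled in turn.

    This is a stand-in importance measure chosen for the sketch.
    """
    base = mse(beta, X, y)
    imp = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        Xp = X.copy()
        Xp[:, j] = rng.permutation(Xp[:, j])
        imp[j] = mse(beta, Xp, y) - base
    return imp


# Reference model: ordinary least squares.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
best_loss = mse(beta_hat, X, y)

# Approximate the set of good models by sampling coefficient vectors near
# the least-squares fit and keeping those within (1 + epsilon) of its loss.
epsilon = 0.05  # hypothetical tolerance
threshold = (1 + epsilon) * best_loss
candidates = beta_hat + rng.normal(scale=0.5, size=(2000, 2))
good = [b for b in candidates if mse(b, X, y) <= threshold]
good.append(beta_hat)  # the reference model is always a good model

# The cloud: one row per good model, one column per variable.
cloud = np.array([permutation_importance(b, X, y, rng) for b in good])

print("good models kept:", len(good))
print("importance range, feature 1:", cloud[:, 0].min(), cloud[:, 0].max())
print("importance range, feature 2:", cloud[:, 1].min(), cloud[:, 1].max())
```

Plotting `cloud[:, 0]` against `cloud[:, 1]` gives a two-dimensional picture in the spirit of a variable importance diagram. Because the two features are correlated, a good model can lean on either one, so each feature's importance can span a wide range even though all retained models have nearly the same loss.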