LG | Nov 19, 2025

CID: Measuring Feature Importance Through Counterfactual Distributions

arXiv:2511.15371v1 | h-index: 4
Originality: Incremental advance
AI Analysis

CID provides a new tool for model analysis, addressing the need for well-founded feature importance measures in interpretable AI. The advance is incremental, since it builds on existing local explainers.

The paper addresses the problem of measuring feature importance in machine learning models by introducing Counterfactual Importance Distribution (CID), a post-hoc local method. CID generates counterfactuals and ranks features by a distributional dissimilarity measure, and the authors report improved faithfulness metrics for the resulting explanations.

Assessing the importance of individual features in Machine Learning is critical to understanding the model's decision-making process. While numerous methods exist, the lack of a definitive ground truth for comparison highlights the need for alternative, well-founded measures. This paper introduces a novel post-hoc local feature importance method called Counterfactual Importance Distribution (CID). We generate two sets of positive and negative counterfactuals, model their distributions using Kernel Density Estimation, and rank features based on a distributional dissimilarity measure. This measure, grounded in a rigorous mathematical framework, satisfies key properties required to function as a valid metric. We showcase the effectiveness of our method by comparing it with well-established local feature importance explainers. Our method not only offers complementary perspectives to existing approaches, but also improves performance on faithfulness metrics (both comprehensiveness and sufficiency), resulting in more faithful explanations of the system. These results highlight its potential as a valuable tool for model analysis.
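The pipeline described in the abstract (two counterfactual sets, a density estimate per feature, a dissimilarity score, a ranking) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the paper's method: the abstract does not name the dissimilarity measure, so the Hellinger distance stands in as one example of a valid metric; the bandwidth uses a normal-reference rule of thumb; the counterfactual sets are synthetic random samples rather than the output of a real counterfactual generator; and the names `gaussian_kde`, `hellinger`, and `rank_features` are hypothetical.

```python
import math
import random
import statistics


def gaussian_kde(samples):
    """Return a 1-D Gaussian kernel density estimate as a callable.

    Assumes the samples are not all identical (non-zero standard deviation).
    """
    n = len(samples)
    # Normal-reference rule-of-thumb bandwidth (an assumption, not from the paper).
    h = 1.06 * statistics.stdev(samples) * n ** (-1 / 5)
    norm = 1.0 / (n * h * math.sqrt(2 * math.pi))

    def pdf(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples)

    return pdf


def hellinger(pos, neg, grid_size=200):
    """Hellinger distance between the KDEs of two 1-D samples (stand-in measure)."""
    p, q = gaussian_kde(pos), gaussian_kde(neg)
    lo, hi = min(pos + neg), max(pos + neg)
    pad = 0.25 * (hi - lo)  # extend the grid so the kernel tails are covered
    lo, hi = lo - pad, hi + pad
    step = (hi - lo) / (grid_size - 1)
    # Riemann-sum approximation of the integral of (sqrt(p) - sqrt(q))^2.
    integral = sum(
        (math.sqrt(p(lo + i * step)) - math.sqrt(q(lo + i * step))) ** 2
        for i in range(grid_size)
    ) * step
    return min(1.0, math.sqrt(0.5 * integral))


def rank_features(pos_cfs, neg_cfs, names):
    """Rank features by dissimilarity between positive and negative counterfactuals."""
    scores = []
    for j, name in enumerate(names):
        pos_col = [row[j] for row in pos_cfs]
        neg_col = [row[j] for row in neg_cfs]
        scores.append((name, hellinger(pos_col, neg_col)))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Synthetic stand-ins for counterfactual sets: feature "a" shifts between the
# two sets, feature "b" follows the same distribution in both.
random.seed(0)
pos_cfs = [[random.gauss(2, 1), random.gauss(0, 1)] for _ in range(200)]
neg_cfs = [[random.gauss(-2, 1), random.gauss(0, 1)] for _ in range(200)]
ranking = rank_features(pos_cfs, neg_cfs, ["a", "b"])
```

In this toy setup the feature whose distribution differs between the two sets should receive the larger score and rank first. In practice a library estimator such as `scipy.stats.gaussian_kde` would likely replace the hand-written KDE; the pure-Python version is used here only to keep the sketch dependency-free.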

