LG ML Oct 1, 2019

Randomized Ablation Feature Importance

arXiv:1910.00174v2, 14 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for reliable feature importance assessment in machine learning and offers a method with uncertainty quantification, but it appears incremental because it builds on existing ablation-based approaches.

The paper measures feature importance in predictive models by ablating each feature through random replacement and quantifying the resulting change in prediction loss. It also provides statistical uncertainty measures for these importance estimates.

Given a model $f$ that predicts a target $y$ from a vector of input features $\pmb{x} = (x_1, x_2, \ldots, x_M)$, we seek to measure the importance of each feature with respect to the model's ability to make a good prediction. To this end, we consider how (on average) some measure of goodness or badness of prediction (which we term "loss" $\ell$) changes when we hide or ablate each feature from the model. To ablate a feature, we replace its value with another possible value chosen at random. By averaging over many points and many possible replacements, we measure the importance of a feature on the model's ability to make good predictions. Furthermore, we present statistical measures of uncertainty that quantify how confident we are that the feature importance we measure from our finite dataset and finite number of ablations is close to the theoretical true importance value.
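The procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the paper's reference implementation: the function name `randomized_ablation_importance`, its parameters, the choice to draw replacement values from the feature's empirical marginal distribution, and the use of a plain standard error of the mean as the uncertainty measure are all assumptions made for this example.

```python
import numpy as np


def randomized_ablation_importance(predict, loss, X, y, feature,
                                   n_repeats=10, rng=None):
    """Estimate one feature's importance as the mean increase in loss
    when that feature is replaced by a randomly chosen value.

    predict: callable mapping an (n, M) array to n predictions
    loss:    callable mapping (y_true, y_pred) to n per-sample losses
    Returns (importance, standard_error).
    """
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    base = loss(y, predict(X))  # per-sample loss on the intact data

    deltas = np.empty((n_repeats, n))
    for r in range(n_repeats):
        X_abl = X.copy()
        # Assumption: "another possible value" is drawn from the values
        # this feature takes in the dataset (its empirical marginal).
        X_abl[:, feature] = rng.choice(X[:, feature], size=n, replace=True)
        deltas[r] = loss(y, predict(X_abl)) - base

    # Average over replacements first, then over data points.
    per_point = deltas.mean(axis=0)
    importance = per_point.mean()
    # Assumption: a simple standard error over data points stands in for
    # the paper's uncertainty measures, which are not detailed here.
    stderr = per_point.std(ddof=1) / np.sqrt(n)
    return importance, stderr


# Toy usage: the hypothetical model depends on feature 0 and ignores feature 1.
data_rng = np.random.default_rng(0)
X_demo = data_rng.normal(size=(500, 2))
y_demo = 3.0 * X_demo[:, 0]


def predict_demo(X):
    return 3.0 * X[:, 0]


def squared_error(y_true, y_pred):
    return (y_true - y_pred) ** 2


for j in range(2):
    imp, se = randomized_ablation_importance(
        predict_demo, squared_error, X_demo, y_demo, feature=j, rng=1)
    print(f"feature {j}: importance = {imp:.3f} +/- {se:.3f}")
```

In this toy setup the ignored feature receives an importance of exactly zero, because replacing its values cannot change the model's predictions, while the feature the model relies on receives a clearly positive score.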

