ME, ML · Nov 21, 2021

Decorrelated Variable Importance

arXiv:2111.10853v1 · 30 citations
Originality: Incremental advance
AI Analysis

This work addresses the interpretability of black-box prediction methods, a concern for both researchers and practitioners. It is an incremental advance because it builds on the existing LOCO approach.

The paper tackles the problem of variable importance measures being confounded by correlations between covariates, proposing a modified LOCO parameter to mitigate this effect, and shows how to estimate it using semiparametric models.

Because of the widespread use of black-box prediction methods such as random forests and neural nets, there is renewed interest in developing methods for quantifying variable importance as part of the broader goal of interpretable prediction. A popular approach is to define a variable importance parameter, known as LOCO (Leave Out COvariates), based on dropping covariates from a regression model. This is essentially a nonparametric version of R-squared. This parameter is very general and can be estimated nonparametrically, but it can be hard to interpret because it is affected by correlation between covariates. We propose a method for mitigating the effect of correlation by defining a modified version of LOCO. This new parameter is difficult to estimate nonparametrically, but we show how to estimate it using semiparametric models.
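To make the correlation problem concrete, here is a minimal sketch of a sample-split LOCO estimate: the increase in held-out mean squared error when one covariate is dropped. It is not the paper's estimator and does not implement the modified (decorrelated) parameter. It assumes a linear least-squares fit as a stand-in for a generic regression method, and the names `fit_predict_ols`, `loco_importance`, and `simulate` are hypothetical, as is the simulated data-generating model.

```python
import numpy as np


def fit_predict_ols(X_train, y_train, X_test):
    """Least-squares fit with intercept; a stand-in for any regression method."""
    A_train = np.column_stack([np.ones(len(X_train)), X_train])
    A_test = np.column_stack([np.ones(len(X_test)), X_test])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return A_test @ coef


def loco_importance(X, y, j, seed=0):
    """Held-out MSE without column j minus held-out MSE with all columns."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    half = len(y) // 2
    train, test = idx[:half], idx[half:]

    X_drop = np.delete(X, j, axis=1)  # same data with covariate j left out
    pred_full = fit_predict_ols(X[train], y[train], X[test])
    pred_drop = fit_predict_ols(X_drop[train], y[train], X_drop[test])

    mse_full = np.mean((y[test] - pred_full) ** 2)
    mse_drop = np.mean((y[test] - pred_drop) ** 2)
    return mse_drop - mse_full


def simulate(rho, n=20000, seed=1):
    """Assumed toy model: y = 2*x + z + noise, with corr(x, z) = rho."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)
    x = rho * z + np.sqrt(1.0 - rho**2) * rng.normal(size=n)
    y = 2.0 * x + z + rng.normal(size=n)
    return np.column_stack([x, z]), y


# Same coefficient on x in both settings; only the correlation with z changes.
X_ind, y_ind = simulate(rho=0.0)
X_cor, y_cor = simulate(rho=0.9)
loco_ind = loco_importance(X_ind, y_ind, j=0)
loco_cor = loco_importance(X_cor, y_cor, j=0)
print(f"LOCO for x, rho=0.0: {loco_ind:.3f}")
print(f"LOCO for x, rho=0.9: {loco_cor:.3f}")
```

In this toy linear Gaussian model the population LOCO value for x is 4(1 - rho^2), so it falls from 4 to 0.76 as rho goes from 0 to 0.9 even though the coefficient on x is unchanged. This is the sensitivity to correlation that the abstract describes.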

