LG AI

Nov 18, 2021

Covered Information Disentanglement: Model Transparency via Unbiased Permutation Importance

João Pereira, Erik S. G. Stroes, Aeilko H. Zwinderman, Evgeni Levin

arXiv:2111.09744v23.117 citations

Originality: Incremental advance

AI Analysis

This work addresses model transparency for domains such as medicine, where understanding feature importance is critical, though it is an incremental improvement over existing permutation importance methods.

The paper tackles the problem of permutation importance undervaluing features in the presence of covariates by proposing Covered Information Disentanglement (CID), which corrects for information overlap between features. It demonstrates the method's efficacy on a controlled toy dataset and discusses its effect on real-world medical data.

Model transparency is a prerequisite in many domains and an increasingly popular area in machine learning research. In the medical domain, for instance, unveiling the mechanisms behind a disease often has higher priority than the diagnosis itself, since it might dictate or guide potential treatments and research directions. One of the most popular approaches to explaining a model's global predictions is permutation importance, in which the performance on permuted data is benchmarked against the baseline. However, this method and other related approaches will undervalue the importance of a feature in the presence of covariates, since these cover part of the information it provides. To address this issue, we propose Covered Information Disentanglement (CID), a method that considers all feature information overlap to correct the values provided by permutation importance. We further show how to compute CID efficiently when coupled with Markov random fields. We demonstrate its efficacy in adjusting permutation importance first on a controlled toy dataset and then discuss its effect on real-world medical data.
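The undervaluation effect described in the abstract can be illustrated with a minimal sketch of plain permutation importance. This is not CID and not the authors' code. The helper names `fit_linear` and `permutation_importance`, the synthetic data, and the choice of a least-squares linear model are all assumptions made for illustration. The sketch fits one model on a feature together with a correlated covariate and another model on that feature alone, then compares the importance the same feature receives in each case.

```python
import numpy as np

def fit_linear(X, y):
    """Fit ordinary least squares with an intercept; return a predict function.
    (Hypothetical helper; any fitted model with a predict function would do.)"""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda X_new: np.column_stack([X_new, np.ones(len(X_new))]) @ coef

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in MSE over the baseline when one column is shuffled."""
    rng = np.random.default_rng(seed)
    baseline = np.mean((predict(X) - y) ** 2)
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        losses = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffling column j breaks its link to y and to the other columns.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            losses.append(np.mean((predict(X_perm) - y) ** 2))
        importances[j] = np.mean(losses) - baseline
    return importances

# Synthetic data (an assumption, not the paper's toy dataset):
# x0 and x1 are noisy copies of the same latent signal z; x2 is pure noise.
rng = np.random.default_rng(42)
n = 2000
z = rng.normal(size=n)
x0 = z + 0.5 * rng.normal(size=n)
x1 = z + 0.5 * rng.normal(size=n)
x2 = rng.normal(size=n)
y = z + 0.1 * rng.normal(size=n)

X_both = np.column_stack([x0, x1, x2])
X_alone = x0.reshape(-1, 1)

# For brevity, importance is evaluated on the training data.
imp_both = permutation_importance(fit_linear(X_both, y), X_both, y)
imp_alone = permutation_importance(fit_linear(X_alone, y), X_alone, y)

print("x0 importance with covariate x1 present:", imp_both[0])
print("x0 importance when used alone:          ", imp_alone[0])
```

In this setup the model that sees both `x0` and `x1` splits its weight between them, so shuffling `x0` hurts it less than it hurts the model that relies on `x0` alone. That gap is the kind of covered information the abstract says permutation importance fails to credit.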
