ML · LG · Nov 26, 2021

Using Shapley Values and Variational Autoencoders to Explain Predictive Models with Dependent Mixed Features

arXiv:2111.13507v2 · 25 citations
Originality: Highly original
AI Analysis

This work addresses the challenge of producing reliable explanations for complex machine learning models when features are dependent, which matters for interpretability and trust in AI systems. The contribution is incremental in that it builds on existing Shapley value and VAE methods.

The paper tackles the problem of accurately estimating Shapley values for predictive models with dependent mixed features. It uses a variational autoencoder with arbitrary conditioning (VAEAC) to model all feature dependencies simultaneously, and simulation studies show that this approach outperforms state-of-the-art methods across a wide range of settings with continuous and mixed features. In high-dimensional settings, VAEAC combined with a non-uniform masking scheme significantly outperforms competing methods.

Shapley values are today extensively used as a model-agnostic explanation framework to explain complex predictive machine learning models. Shapley values have desirable theoretical properties and a sound mathematical foundation in the field of cooperative game theory. Precise Shapley value estimates for dependent data rely on accurate modeling of the dependencies between all feature combinations. In this paper, we use a variational autoencoder with arbitrary conditioning (VAEAC) to model all feature dependencies simultaneously. We demonstrate through comprehensive simulation studies that our VAEAC approach to Shapley value estimation outperforms the state-of-the-art methods for a wide range of settings for both continuous and mixed dependent features. For high-dimensional settings, our VAEAC approach with a non-uniform masking scheme significantly outperforms competing methods. Finally, we apply our VAEAC approach to estimate Shapley value explanations for the Abalone data set from the UCI Machine Learning Repository.
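
To make the role of dependence modeling concrete, the following is a minimal sketch of conditional Shapley value estimation for a small number of features. It is not the paper's implementation: the conditional distribution of the unobserved features is drawn from a multivariate Gaussian as a simple stand-in for the VAEAC sampler, and the names `conditional_gaussian_samples`, `shapley_values`, `predict`, `mu`, and `cov` are hypothetical, chosen only for this illustration.

```python
from itertools import combinations
from math import factorial

import numpy as np


def conditional_gaussian_samples(x, S, mu, cov, n_samples, rng):
    """Keep x[S] fixed and sample the remaining features from
    N(mu, cov) conditioned on x[S]. (Assumed Gaussian stand-in for VAEAC.)"""
    M = len(mu)
    S = list(S)
    Sbar = [j for j in range(M) if j not in S]
    samples = np.tile(np.asarray(x, dtype=float), (n_samples, 1))
    if not Sbar:          # all features observed: nothing to sample
        return samples
    if not S:             # no features observed: sample from the marginal
        samples[:, Sbar] = rng.multivariate_normal(mu, cov, size=n_samples)
        return samples
    c_SS = cov[np.ix_(S, S)]
    c_bS = cov[np.ix_(Sbar, S)]
    c_bb = cov[np.ix_(Sbar, Sbar)]
    K = np.linalg.solve(c_SS, c_bS.T).T          # c_bS @ inv(c_SS)
    cond_mu = mu[Sbar] + K @ (x[S] - mu[S])
    cond_cov = c_bb - K @ c_bS.T
    cond_cov = (cond_cov + cond_cov.T) / 2       # remove round-off asymmetry
    samples[:, Sbar] = rng.multivariate_normal(cond_mu, cond_cov, size=n_samples)
    return samples


def shapley_values(predict, x, mu, cov, n_samples=2000, seed=0):
    """Exact Shapley formula over all coalitions, with the contribution
    function v(S) = E[f(x) | x_S] estimated by Monte Carlo."""
    rng = np.random.default_rng(seed)
    M = len(x)
    v = {}
    for size in range(M + 1):
        for S in combinations(range(M), size):
            z = conditional_gaussian_samples(x, S, mu, cov, n_samples, rng)
            v[S] = predict(z).mean()
    phi = np.zeros(M)
    for j in range(M):
        others = [k for k in range(M) if k != j]
        for size in range(M):
            weight = factorial(size) * factorial(M - size - 1) / factorial(M)
            for S in combinations(others, size):
                S_with_j = tuple(sorted(S + (j,)))
                phi[j] += weight * (v[S_with_j] - v[S])
    return phi, v


# Toy example: a linear model on three correlated features (made-up numbers).
beta = np.array([1.0, -2.0, 0.5])
predict = lambda z: z @ beta
mu = np.zeros(3)
cov = np.array([[1.0, 0.5, 0.5],
                [0.5, 1.0, 0.5],
                [0.5, 0.5, 1.0]])
x = np.array([1.0, 2.0, -1.0])

phi, v = shapley_values(predict, x, mu, cov)
print("Shapley values:", phi)
print("Sum of values:", phi.sum(), "f(x) - E[f(x)]:", predict(x[None, :])[0] - v[()])
```

The loop over all coalitions grows as 2^M, so this brute-force form is only practical for a handful of features. The quality of the result depends entirely on how well the conditional sampler captures the feature dependencies, which is the component the paper replaces with VAEAC.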

Code Implementations: 1 repo