LG ML Jul 20, 2020

Towards Ground Truth Explainability on Tabular Data

Brian Barr, Ke Xu, Claudio Silva, Enrico Bertini, Robert Reilly, C. Bayan Bruss, Jason D. Wittenbach

arXiv:2007.10532v19.610 citations Has Code

Originality: Synthesis-oriented

AI Analysis

This work gives data scientists a tool for testing and understanding explainability methods on tabular data, but the contribution is incremental: it adapts an existing approach from image data to a new domain.

The paper tackles the lack of ground truth for explanations in post hoc explainability on tabular data by proposing a method using copulas to create synthetic datasets with controlled statistical properties, enabling users to build intuition through three demonstrated use cases.

In data science, there is a long history of using synthetic data for method development, feature selection, and feature engineering. Our current interest in synthetic data comes from recent work in explainability. Today's datasets are typically larger and more complex, requiring less interpretable models. In the setting of post hoc explainability, there is no ground truth for explanations. Inspired by recent work in explaining image classifiers that does provide ground truth, we propose a similar solution for tabular data. Using copulas, which give a concise specification of the desired statistical properties of a dataset, users can build intuition around explainability through controlled datasets and experimentation. The current capabilities are demonstrated on three use cases: one-dimensional logistic regression, the impact of correlation from informative features, and the impact of correlation from redundant variables.
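To make the copula idea concrete, here is a minimal sketch of how a synthetic tabular dataset with a known labelling rule might be generated. It is not the authors' code: the choice of a Gaussian copula, the exponential and uniform marginals, the logistic labelling rule, and the names `sample_gaussian_copula` and `make_dataset` are all assumptions made for illustration. The second feature gets zero weight by default, so it acts as a correlated but redundant variable, loosely in the spirit of the third use case.

```python
import math
import random
from statistics import NormalDist


def sample_gaussian_copula(n, rho, seed=0):
    """Draw n pairs (u1, u2) of uniforms coupled by a Gaussian copula.

    rho is the correlation of the underlying standard normals
    (an illustrative choice of copula family, not taken from the paper).
    """
    rng = random.Random(seed)
    phi = NormalDist().cdf  # standard normal CDF
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        # Mix z1 with independent noise so that corr(z1, z2) = rho.
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        pairs.append((phi(z1), phi(z2)))
    return pairs


def make_dataset(n, rho, weights=(2.0, 0.0), seed=0):
    """Build rows (x1, x2, y) with known ground-truth feature weights.

    Marginals are assumed: x1 is exponential with rate 1, x2 is uniform
    on [0, 1]. The label y is drawn from a logistic model whose weights
    are known, which is what supplies a ground truth for explanations.
    """
    rng = random.Random(seed + 1)
    rows = []
    for u1, u2 in sample_gaussian_copula(n, rho, seed):
        u1 = min(u1, 1.0 - 1e-12)    # guard against log(0)
        x1 = -math.log(1.0 - u1)     # inverse CDF of exponential(1)
        x2 = u2                      # uniform marginal, left unchanged
        logit = weights[0] * (x1 - 1.0) + weights[1] * (x2 - 0.5)
        p = 1.0 / (1.0 + math.exp(-logit))
        y = 1 if rng.random() < p else 0
        rows.append((x1, x2, y))
    return rows


if __name__ == "__main__":
    data = make_dataset(n=1000, rho=0.8)
    print(data[:3])
```

Because the copula fixes the dependence separately from the marginals, `rho` can be varied while the labelling rule stays fixed, and an explanation method can then be checked for whether it assigns importance to the redundant `x2`.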
