LG · AI · Dec 13, 2020

MEME: Generating RNN Model Explanations via Model Extraction

arXiv:2012.06954v1 · 16 citations
AI Analysis

This work addresses the problem of improving the explainability and interpretability of RNNs for researchers and practitioners working with sequential data.

This paper introduces MEME, a model extraction approach that approximates Recurrent Neural Networks (RNNs) with interpretable models. The extracted models, represented by human-understandable concepts and their interactions, can be used to interpret RNN decision-making both locally and globally.

Recurrent Neural Networks (RNNs) have achieved remarkable performance on a range of tasks. A key step to further empowering RNN-based approaches is improving their explainability and interpretability. In this work we present MEME: a model extraction approach capable of approximating RNNs with interpretable models represented by human-understandable concepts and their interactions. We demonstrate how MEME can be applied to two multivariate, continuous data case studies: Room Occupation Prediction and In-Hospital Mortality Prediction. Using these case studies, we show how our extracted models can be used to interpret RNNs both locally and globally, by approximating RNN decision-making via interpretable concept interactions.
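To make the general idea of model extraction concrete, the sketch below fits a small interpretable surrogate to a black-box recurrent model and measures how often the two agree (fidelity). It is not MEME's algorithm. Everything in it is an assumption made for illustration: the one-unit tanh RNN with hand-picked weights, the use of 1-D k-means centroids over hidden states as stand-ins for "concepts", the concept-to-label lookup table, and all function names.

```python
import math
import random


def rnn_hidden_states(seq, w_in=1.5, w_rec=0.8):
    """Toy 1-unit tanh RNN standing in for a trained black box (hypothetical weights)."""
    h, states = 0.0, []
    for x in seq:
        h = math.tanh(w_in * x + w_rec * h)
        states.append(h)
    return states


def rnn_predict(h):
    """Black-box decision read from the final hidden state."""
    return 1 if h > 0.0 else 0


def nearest(centroids, v):
    """Index of the centroid closest to v."""
    return min(range(len(centroids)), key=lambda i: abs(centroids[i] - v))


def kmeans_1d(values, k, iters=20):
    """Plain 1-D k-means; each centroid plays the role of a 'concept' (assumption)."""
    srt = sorted(values)
    # Deterministic initialisation at evenly spaced quantiles.
    centroids = [srt[(2 * i + 1) * len(srt) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            groups[nearest(centroids, v)].append(v)
        # Keep the old centroid if a group ends up empty.
        centroids = [sum(g) / len(g) if g else c for g, c in zip(groups, centroids)]
    return centroids


def extract_surrogate(sequences, k=4):
    """Cluster hidden states into k concepts, then map each concept to the
    label the black box most often assigns to sequences ending in it."""
    all_states = [h for s in sequences for h in rnn_hidden_states(s)]
    centroids = kmeans_1d(all_states, k)
    votes = [[0, 0] for _ in range(k)]
    for s in sequences:
        h_final = rnn_hidden_states(s)[-1]
        votes[nearest(centroids, h_final)][rnn_predict(h_final)] += 1
    table = [0 if v[0] >= v[1] else 1 for v in votes]
    return centroids, table


def surrogate_predict(centroids, table, seq):
    """Interpretable prediction: look up the label of the final concept."""
    return table[nearest(centroids, rnn_hidden_states(seq)[-1])]


def fidelity(centroids, table, sequences):
    """Fraction of sequences on which the surrogate agrees with the black box."""
    agree = sum(
        surrogate_predict(centroids, table, s) == rnn_predict(rnn_hidden_states(s)[-1])
        for s in sequences
    )
    return agree / len(sequences)


rng = random.Random(0)
sequences = [[rng.uniform(-1, 1) for _ in range(10)] for _ in range(200)]
centroids, table = extract_surrogate(sequences, k=4)
print("concept centroids:", [round(c, 2) for c in centroids])
print("concept -> label:", table)
print("fidelity:", fidelity(centroids, table, sequences))
```

In this sketch the lookup table gives a global view of the surrogate (which concepts lead to which label), while tracing the sequence of nearest concepts for a single input would give a local one. A real extraction would use a trained RNN and a richer surrogate than a lookup table.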

Code Implementations: 1 repo
