LG · Jun 12, 2022

A Functional Information Perspective on Model Interpretation

arXiv:2206.05700v2 · 6 citations · h-index: 43
Originality: Highly original
AI Analysis

This work addresses the interpretability challenge in deep learning for researchers and practitioners, offering a principled approach to feature attribution.

The authors address the problem of interpreting complex predictive models by proposing a theoretical framework that measures how much features contribute to the functional entropy of the network, and they report that their method outperforms existing sampling-based interpretability methods on image, text, and audio data.

Contemporary predictive models are hard to interpret because their deep nets exploit numerous complex relations between input elements. This work suggests a theoretical framework for model interpretability that measures the contribution of relevant features to the functional entropy of the network with respect to the input. We rely on the log-Sobolev inequality, which bounds the functional entropy by the functional Fisher information with respect to the covariance of the data. This provides a principled way to measure how much information a subset of features contributes to the decision function. Through extensive experiments, we show that our method surpasses existing sampling-based interpretability methods on various data signals such as image, text, and audio.
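To make the bound in the abstract concrete, here is a minimal sketch, not the authors' implementation. It assumes the Gaussian form of the log-Sobolev inequality, Ent(f) <= 1/2 * E[ <Sigma grad f, grad f> / f ], with a diagonal covariance Sigma, so the right-hand side splits into one term per feature. The names `fisher_contributions`, `toy_model`, `toy_grad`, and the weights `W` are hypothetical and chosen only for illustration.

```python
import math
import random


def fisher_contributions(f, grad_f, mean, var, n_samples=2000, seed=0):
    """Monte Carlo estimate of each feature's term in the Fisher-information bound.

    Assumes z ~ N(mean, diag(var)) and f(z) > 0. Entry i estimates
    0.5 * E[ var[i] * (df/dz_i)^2 / f(z) ]; summing the entries gives the
    right-hand side of the (assumed) Gaussian log-Sobolev inequality.
    """
    rng = random.Random(seed)
    d = len(mean)
    totals = [0.0] * d
    for _ in range(n_samples):
        # Sample one input from the assumed Gaussian data distribution.
        z = [rng.gauss(m, math.sqrt(v)) for m, v in zip(mean, var)]
        fz = f(z)
        g = grad_f(z)
        for i in range(d):
            totals[i] += 0.5 * var[i] * g[i] ** 2 / fz
    return [t / n_samples for t in totals]


# Hypothetical toy "network": a logistic unit that ignores the third feature.
W = [3.0, 0.5, 0.0]


def toy_model(z):
    s = sum(w * x for w, x in zip(W, z))
    return 1.0 / (1.0 + math.exp(-s))


def toy_grad(z):
    # Analytic gradient of the sigmoid: p * (1 - p) * w_i.
    p = toy_model(z)
    return [p * (1.0 - p) * w for w in W]


scores = fisher_contributions(
    toy_model, toy_grad, mean=[0.0, 0.0, 0.0], var=[1.0, 1.0, 1.0]
)
print(scores)
```

In this toy setting the per-feature terms scale with the squared weight, so the first feature dominates and the unused third feature scores zero. For a real network the gradient would come from automatic differentiation instead of a closed form.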

Code Implementations: 1 repo