LG · Jun 12, 2022

A Functional Information Perspective on Model Interpretation

Itai Gat, Nitay Calderon, Roi Reichart, Tamir Hazan

arXiv:2206.05700v2 · 7.86 citations · h-index: 43 · Has Code

Originality: Highly original

AI Analysis

This work addresses the interpretability challenge in deep learning for researchers and practitioners, offering a principled approach to feature attribution.

The authors propose a theoretical framework for interpreting complex predictive models. It measures how much a subset of input features contributes to the functional entropy of the network, using the log-Sobolev inequality to bound that entropy by the functional Fisher information. In their experiments, the method surpasses existing sampling-based interpretability methods on image, text, and audio data.

Contemporary predictive models are hard to interpret as their deep nets exploit numerous complex relations between input elements. This work suggests a theoretical framework for model interpretability by measuring the contribution of relevant features to the functional entropy of the network with respect to the input. We rely on the log-Sobolev inequality that bounds the functional entropy by the functional Fisher information with respect to the covariance of the data. This provides a principled way to measure the amount of information contribution of a subset of features to the decision function. Through extensive experiments, we show that our method surpasses existing interpretability sampling-based methods on various data signals such as image, text, and audio.
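
For reference, the inequality the abstract relies on can be written out explicitly. The following is a sketch in the standard form of the Gaussian log-Sobolev inequality, assuming the input z follows a Gaussian measure \mu with covariance \Sigma and that f is a positive, smooth decision function. The symbols f, \mu, \Sigma, \sigma_i, and S are notation chosen here for illustration and may differ from the paper's.

```latex
% Functional entropy of a positive function f under the measure \mu
\[
\operatorname{Ent}_\mu(f)
  = \int f \log f \, d\mu
  - \Big( \int f \, d\mu \Big) \log \Big( \int f \, d\mu \Big)
\]

% Gaussian log-Sobolev inequality: entropy is bounded by the
% functional Fisher information weighted by the covariance \Sigma
\[
\operatorname{Ent}_\mu(f)
  \le \frac{1}{2} \int \frac{\langle \Sigma \nabla f, \nabla f \rangle}{f} \, d\mu
\]

% Assumption for illustration: \Sigma = \operatorname{diag}(\sigma_1^2, \dots, \sigma_d^2).
% The bound then splits into one non-negative term per input coordinate,
% and a feature subset S is credited with the sum of its own terms.
\[
\frac{1}{2} \int \frac{\langle \Sigma \nabla f, \nabla f \rangle}{f} \, d\mu
  = \sum_{i=1}^{d} \frac{\sigma_i^2}{2} \int \frac{(\partial_i f)^2}{f} \, d\mu ,
\qquad
\text{share of } S
  = \sum_{i \in S} \frac{\sigma_i^2}{2} \int \frac{(\partial_i f)^2}{f} \, d\mu
\]
```

Under the diagonal-covariance assumption, each term is non-negative and depends only on the partial derivative with respect to one coordinate, which is one way to read "the amount of information contribution of a subset of features". The paper's exact definition of the per-subset contribution may differ from this sketch.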
