Greedy PIG: Adaptive Integrated Gradients
This work addresses interpretability in deep learning. It is an incremental advance that makes attribution methods more versatile for practitioners.
The paper tackles the challenge of interpreting deep learning model predictions by proposing Greedy PIG, an adaptive generalization of path integrated gradients for feature attribution and feature selection. It demonstrates the method on tasks such as image feature attribution and graph compression/explanation, and reports improved performance.
Deep learning has become the standard approach for most machine learning tasks. While its impact is undeniable, interpreting the predictions of deep learning models from a human perspective remains a challenge. In contrast to model training, model interpretability is harder to quantify and to pose as an explicit optimization problem. Inspired by the AUC softmax information curve (AUC SIC) metric for evaluating feature attribution methods, we propose a unified discrete optimization framework for feature attribution and feature selection based on subset selection. This leads to a natural adaptive generalization of the path integrated gradients (PIG) method for feature attribution, which we call Greedy PIG. We demonstrate the success of Greedy PIG on a wide variety of tasks, including image feature attribution, graph compression/explanation, and post-hoc feature selection on tabular data. Our results show that introducing adaptivity is a versatile way to make attribution methods more powerful.
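To make the idea of an adaptive generalization of path integrated gradients concrete, the following is a minimal sketch, not the paper's implementation. It assumes a toy logistic model with an analytic gradient, a straight-line path from a baseline to the input, and one plausible reading of "adaptive": after each round the top-attributed feature is fixed to its input value in the baseline and attributions are recomputed. The names `model`, `model_grad`, `integrated_gradients`, and `greedy_adaptive_ig` are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def model(x, w):
    """Toy differentiable model (an assumption): logistic score of w . x."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))


def model_grad(x, w):
    """Analytic gradient of the toy model with respect to the input x."""
    p = model(x, w)
    return [p * (1.0 - p) * wi for wi in w]


def integrated_gradients(x, baseline, w, steps=50):
    """Midpoint-rule approximation of integrated gradients along the
    straight line from `baseline` to `x`."""
    n = len(x)
    avg_grad = [0.0] * n
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grad = model_grad(point, w)
        for i in range(n):
            avg_grad[i] += grad[i] / steps
    # Scale the averaged gradient by the displacement from the baseline.
    return [(xi - b) * g for xi, b, g in zip(x, baseline, avg_grad)]


def greedy_adaptive_ig(x, baseline, w, rounds):
    """Adaptive sketch (assumed variant): each round, recompute attributions
    from the current baseline, select the highest-scoring unselected feature,
    and fix that feature to its input value before the next round."""
    selected = []
    current = list(baseline)
    for _ in range(rounds):
        candidates = [i for i in range(len(x)) if i not in selected]
        if not candidates:
            break
        attributions = integrated_gradients(x, current, w)
        best = max(candidates, key=lambda i: attributions[i])
        selected.append(best)
        current[best] = x[best]
    return selected


if __name__ == "__main__":
    x = [1.0, 2.0, -1.0]
    w = [0.5, 1.0, 2.0]
    baseline = [0.0, 0.0, 0.0]
    print("one-shot attributions:", integrated_gradients(x, baseline, w))
    print("greedy selection order:", greedy_adaptive_ig(x, baseline, w, rounds=2))
```

The one-shot call ranks all features from a single pass, while the greedy loop re-evaluates attributions after each selection, so later choices can account for features already chosen. In practice the toy model and analytic gradient would be replaced by a trained network and automatic differentiation.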