Explaining a prediction in some nonlinear models
This work addresses interpretability in machine learning, particularly for deep neural networks, but appears incremental as it combines existing methods.
The paper tackles the problem of explaining predictions in nonlinear models by computing the contribution of each input to the output, merging two existing methods: integrated gradients and deep Taylor decomposition. It claims that, unlike DeepLIFT and Deep SHAP, the approach provides a natural, model-specific choice of reference point.
In this article we analyse how to compute the contribution of each input value to the aggregate output of some nonlinear models. Regression and classification applications are presented, together with related algorithms for deep neural networks. The proposed approach merges two methods from the literature: integrated gradients and deep Taylor decomposition. Compared to DeepLIFT and Deep SHAP, it provides a natural choice of reference point that is specific to the model in use.
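As a rough illustration of the integrated gradients ingredient named in the abstract, the sketch below attributes the output of a small nonlinear model to its inputs relative to a reference point. It is not the paper's algorithm: the model `f(x) = tanh(w · x)`, the weights, the all-zeros reference point, and the function names are all assumptions made for illustration, and the paper's model-specific choice of reference point is not reproduced here.

```python
import numpy as np

# Hypothetical weights for a toy model; not taken from the paper.
W = np.array([1.0, -2.0, 0.5])

def model(x):
    """Toy nonlinear model f(x) = tanh(w . x)."""
    return np.tanh(W @ x)

def grad_model(x):
    """Analytic gradient of the toy model with respect to x."""
    return (1.0 - np.tanh(W @ x) ** 2) * W

def integrated_gradients(x, baseline, steps=200):
    """Approximate integrated gradients along the straight path
    from `baseline` to `x`, using a midpoint Riemann sum."""
    alphas = (np.arange(steps) + 0.5) / steps
    grads = np.array(
        [grad_model(baseline + a * (x - baseline)) for a in alphas]
    )
    # Average gradient along the path, scaled by the input displacement.
    return (x - baseline) * grads.mean(axis=0)

x = np.array([0.8, 0.1, -0.4])
baseline = np.zeros(3)  # assumed reference point, chosen arbitrarily here

contrib = integrated_gradients(x, baseline)
print("contributions:", contrib)
print("sum of contributions:", contrib.sum())
print("f(x) - f(baseline):", model(x) - model(baseline))
```

Up to numerical integration error, the contributions sum to `f(x) - f(baseline)`, which is the completeness property of integrated gradients. The result depends on the reference point, which is why the choice of reference is the point of comparison with DeepLIFT and Deep SHAP.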