Value bounds and Convergence Analysis for Averages of LRP attributions
This work provides theoretical insights for scenarios involving multiple non-geometric data augmentations and SmoothGrad-type methods, with an incremental analysis specific to LRP attribution techniques.
The paper studies the numerical properties and convergence behavior of Layer-wise Relevance Propagation (LRP) attribution methods. It derives bounds on singular values and on component-wise attribution values, which yield multiplicative constants governing the convergence of empirical means to expectations. For LRP-beta, these constants are shown to be independent of weight norms.
We analyze numerical properties of Layer-wise Relevance Propagation (LRP)-type attribution methods by representing them as a product of modified gradient matrices. This representation creates an analogy to the products of Jacobian matrices that arise from the chain rule of differentiation. To shed light on the distribution of attribution values, we derive upper bounds for singular values. Furthermore, we derive component-wise bounds for attribution map values. As a main result, we apply these component-wise bounds to obtain multiplicative constants. These constants govern the convergence of empirical means of attributions to expectations of attribution maps. This finding has important implications for scenarios where multiple non-geometric data augmentations are applied to individual test samples, as well as for SmoothGrad-type attribution methods. In particular, our analysis reveals that the constants for LRP-beta remain independent of weight norms, a significant distinction from both gradient-based methods and LRP-epsilon.
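As a rough illustration of how a component-wise bound can turn into a multiplicative constant in a convergence statement, consider a standard Hoeffding-type argument. This is a sketch under stated assumptions and not necessarily the form of the bound derived in the paper: the symbols $R_i$, $C$, $n$, $t$, and $\delta$ are introduced here for illustration only, and the attribution maps $R_1, \dots, R_n$ (for example, from $n$ augmentations of one test sample) are assumed to be i.i.d. with each component $R_i^{(j)}$ lying in $[-C, C]$.

```latex
% Hoeffding's inequality for the j-th attribution component,
% assuming R_i^{(j)} \in [-C, C] and i.i.d. samples (illustrative assumptions).
\[
  \Pr\!\left( \left| \frac{1}{n} \sum_{i=1}^{n} R_i^{(j)}
      - \mathbb{E}\!\left[ R^{(j)} \right] \right| \ge t \right)
  \;\le\; 2 \exp\!\left( - \frac{n t^{2}}{2 C^{2}} \right).
\]
% Equivalently, with probability at least 1 - \delta:
\[
  \left| \frac{1}{n} \sum_{i=1}^{n} R_i^{(j)}
      - \mathbb{E}\!\left[ R^{(j)} \right] \right|
  \;\le\; C \sqrt{ \frac{2 \ln (2/\delta)}{n} }.
\]
```

In this form the component-wise bound $C$ multiplies the $n^{-1/2}$ rate, so a bound that grows with the weight norms would require more samples to reach the same accuracy, whereas a bound independent of the weight norms would not.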