Symmetry-Preserving Paths in Integrated Gradients
This work provides theoretical foundations for interpretability in deep learning by addressing the need for reliable attribution methods. The contribution is incremental, since it builds on existing Integrated Gradients (IG) theory.
The paper addresses the problem of proving that the Integrated Gradients attribution method satisfies the completeness and symmetry-preserving properties, and it examines the uniqueness of IG as a symmetry-preserving path method.
We provide rigorous proofs that the Integrated Gradients (IG) attribution method for deep networks satisfies completeness and symmetry-preserving properties. We also study the uniqueness of IG as a path method preserving symmetry.
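The two properties named above can be illustrated numerically. The sketch below is not taken from the paper: it assumes the standard straight-line definition of IG, IG_i(x) = (x_i - b_i) * integral over alpha in [0, 1] of dF/dx_i(b + alpha (x - b)), with baseline b, and it uses a hypothetical toy function `f` that is symmetric in its first two inputs. The names `f`, `grad_f`, and `integrated_gradients` are illustrative choices, and the integral is approximated with a midpoint Riemann sum.

```python
import math

def f(x):
    # Hypothetical toy function, symmetric in x[0] and x[1].
    return math.tanh(x[0] * x[1]) + 0.5 * x[2] ** 2

def grad_f(x):
    # Analytic gradient of f.
    s = 1.0 - math.tanh(x[0] * x[1]) ** 2
    return [x[1] * s, x[0] * s, x[2]]

def integrated_gradients(grad, x, baseline, steps=1000):
    # Midpoint-rule approximation of IG along the straight path
    # from baseline to x.
    n = len(x)
    total = [0.0] * n
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        g = grad(point)
        for i in range(n):
            total[i] += g[i]
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, total)]

# The symmetric variables take equal values in both the input and the baseline.
x = [0.8, 0.8, -1.2]
baseline = [0.1, 0.1, 0.0]
attr = integrated_gradients(grad_f, x, baseline)

# Completeness: attributions should sum to f(x) - f(baseline).
completeness_gap = abs(sum(attr) - (f(x) - f(baseline)))
# Symmetry preservation: the two symmetric inputs should get equal attributions.
symmetry_gap = abs(attr[0] - attr[1])
```

Under these assumptions, `completeness_gap` should shrink toward zero as `steps` grows, and `symmetry_gap` should be zero up to floating-point error. A numerical check of this kind only illustrates the properties on one function; it is no substitute for the proofs.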