ABE: A Unified Framework for Robust and Faithful Attribution-Based Explainability
This work addresses the need for more transparent and interoperable AI systems by improving attribution-based explainability for researchers and developers, although it appears incremental because it repackages existing methods within a new framework.
The authors address four shortcomings of existing attribution frameworks: limited scalability, high coupling, theoretical constraints, and poor usability. They propose ABE, a unified framework that formalizes attribution methods and integrates state-of-the-art algorithms while ensuring compliance with attribution axioms, providing a scalable and extensible foundation for advancing explainability.
Attribution algorithms are essential for enhancing the interpretability and trustworthiness of deep learning models by identifying key features driving model decisions. Existing frameworks, such as InterpretDL and OmniXAI, integrate multiple attribution methods but suffer from scalability limitations, high coupling, theoretical constraints, and lack of user-friendly implementations, hindering neural network transparency and interoperability. To address these challenges, we propose Attribution-Based Explainability (ABE), a unified framework that formalizes Fundamental Attribution Methods and integrates state-of-the-art attribution algorithms while ensuring compliance with attribution axioms. ABE enables researchers to develop novel attribution techniques and enhances interpretability through four customizable modules: Robustness, Interpretability, Validation, and Data & Model. This framework provides a scalable, extensible foundation for advancing attribution-based explainability and fostering transparent AI systems. Our code is available at: https://github.com/LMBTough/ABE-XAI.
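To make "compliance with attribution axioms" concrete, the following is a minimal sketch of one well-known attribution method, Integrated Gradients, together with a numerical check of the completeness axiom (attributions should sum to the difference between the model output at the input and at the baseline). It does not use the ABE API: the toy sigmoid model, the function names `model`, `model_grad`, and `integrated_gradients`, and all numeric values are assumptions chosen purely for illustration.

```python
import numpy as np


def model(x, w):
    """Toy differentiable model (assumed for illustration): sigmoid of a linear score."""
    return 1.0 / (1.0 + np.exp(-np.dot(w, x)))


def model_grad(x, w):
    """Analytic gradient of the toy model with respect to the input x."""
    p = model(x, w)
    return p * (1.0 - p) * w


def integrated_gradients(x, baseline, w, steps=200):
    """Approximate Integrated Gradients with a midpoint Riemann sum.

    Gradients are averaged along the straight-line path from `baseline`
    to `x`, then scaled elementwise by (x - baseline).
    """
    alphas = (np.arange(steps) + 0.5) / steps            # midpoints in (0, 1)
    path = baseline + alphas[:, None] * (x - baseline)   # shape: (steps, n_features)
    grads = np.stack([model_grad(p, w) for p in path])   # gradient at each path point
    return (x - baseline) * grads.mean(axis=0)


# Hypothetical weights, input, and all-zeros baseline.
w = np.array([1.5, -2.0, 0.5])
x = np.array([1.0, -0.5, 2.0])
baseline = np.zeros_like(x)

attr = integrated_gradients(x, baseline, w)

# Completeness axiom: the attributions should sum to f(x) - f(baseline),
# up to the numerical error of the Riemann approximation.
gap = abs(attr.sum() - (model(x, w) - model(baseline, w)))
print("attributions:", attr)
print("completeness gap:", gap)
```

A framework-level axiom check could take a similar form: compute attributions with any method, then compare their sum against the output difference and flag methods or step counts for which the gap is not negligible.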