From Confusion to Clarity: ProtoScore -- A Framework for Evaluating Prototype-Based XAI
This work addresses a critical gap for XAI researchers and practitioners by providing a tool to objectively assess prototype-based explanation methods. The contribution is incremental, however: it builds on existing properties rather than introducing a new explanation method.
The paper tackles the lack of standardized benchmarks for evaluating prototype-based explainable AI (XAI) methods, particularly for time series data. It introduces ProtoScore, a framework that integrates the Co-12 properties to enable fair comparisons and reduce reliance on user studies.
The complexity and opacity of neural networks (NNs) pose significant challenges, particularly in high-stakes fields such as healthcare, finance, and law, where understanding decision-making processes is crucial. To address these issues, the field of explainable artificial intelligence (XAI) has developed various methods aimed at clarifying AI decision-making, thereby facilitating appropriate trust and validating the fairness of outcomes. Among these methods, prototype-based explanations offer a promising approach that uses representative examples to elucidate model behavior. However, a critical gap exists: there are no standardized benchmarks for objectively comparing prototype-based XAI methods, especially in the context of time series data. This lack of reliable benchmarks results in subjective evaluations, hindering progress in the field. We aim to establish a robust framework, ProtoScore, for assessing prototype-based XAI methods across different data types, with a focus on time series data, facilitating fair and comprehensive evaluations. By integrating the Co-12 properties of Nauta et al., this framework enables effective comparison of prototype methods with each other and with other XAI methods, ultimately assisting practitioners in selecting appropriate explanation methods while minimizing the costs associated with user studies. All code is publicly available at https://github.com/HelenaM23/ProtoScore.