VirtualXAI: A User-Centric Framework for Explainability Assessment Leveraging GPT-Generated Personas
The work addresses a problem faced by AI practitioners, who need better guidance in selecting datasets, models, and XAI methods, though it appears incremental because it builds on existing evaluation concepts.
The paper tackles the challenge of evaluating explainable AI (XAI) methods by proposing a framework that combines quantitative benchmarking with qualitative user assessments from GPT-generated personas. It also includes a recommender system that provides tailored AI model and XAI method recommendations together with an estimated XAI score.
In today's data-driven era, computational systems generate vast amounts of data that drive the digital transformation of industries, a transformation in which Artificial Intelligence (AI) plays a key role. The demand for eXplainable AI (XAI) has increased as a means of enhancing the interpretability, transparency, and trustworthiness of AI models. However, evaluating XAI methods remains challenging: existing evaluation frameworks typically focus on quantitative properties such as fidelity, consistency, and stability, without taking into account qualitative characteristics such as satisfaction and interpretability. In addition, practitioners lack guidance in selecting appropriate datasets, AI models, and XAI methods, which is a major hurdle in human-AI collaboration. To address these gaps, we propose a framework that integrates quantitative benchmarking with qualitative user assessments through virtual personas based on the "Anthology" of Large Language Model (LLM) backstories. Our framework also incorporates a content-based recommender system that leverages dataset-specific characteristics to match new input data with a repository of benchmarked datasets. This matching yields an estimated XAI score and provides tailored recommendations for both the optimal AI model and the XAI method for a given scenario.
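The content-based matching step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the meta-feature vectors, the cosine-similarity measure, the nearest-neighbour rule, the names BenchmarkEntry and recommend, and every value in the repository are hypothetical placeholders chosen only to show the idea of matching a new dataset against benchmarked ones.

```python
import math
from dataclasses import dataclass


@dataclass
class BenchmarkEntry:
    """One benchmarked dataset in a hypothetical repository."""
    name: str
    features: list[float]   # assumed dataset meta-features, scaled to [0, 1]
    best_model: str         # placeholder label, not a result from the paper
    best_xai_method: str    # placeholder label, not a result from the paper
    xai_score: float        # placeholder score, not a result from the paper


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(new_features: list[float], repository: list[BenchmarkEntry]) -> dict:
    """Return the recommendation attached to the most similar benchmarked dataset."""
    if not repository:
        raise ValueError("repository must contain at least one benchmarked dataset")
    best = max(repository, key=lambda e: cosine_similarity(new_features, e.features))
    return {
        "matched_dataset": best.name,
        "similarity": cosine_similarity(new_features, best.features),
        "model": best.best_model,
        "xai_method": best.best_xai_method,
        # Assumption: the estimate is simply the matched dataset's score.
        "estimated_xai_score": best.xai_score,
    }


# Illustrative placeholder repository; all names and numbers are made up.
repository = [
    BenchmarkEntry("tabular_small", [0.1, 0.2, 0.9], "random_forest", "SHAP", 0.80),
    BenchmarkEntry("tabular_wide", [0.8, 0.9, 0.3], "gradient_boosting", "LIME", 0.65),
]

result = recommend([0.15, 0.25, 0.8], repository)
print(result["matched_dataset"], result["model"], result["xai_method"])
```

A nearest-neighbour lookup is used here only because it is the simplest form of content-based matching; a weighted average over several similar datasets would be an equally plausible reading of "estimated XAI score".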