Confident Teacher, Confident Student? A Novel User Study Design for Investigating the Didactic Potential of Explanations and their Impact on Uncertainty
This work addresses the challenge of assessing XAI's effectiveness in human-AI collaboration for complex visual tasks, but it is incremental as it builds on existing debates about evaluation methods without introducing a new paradigm.
The study tackled the problem of evaluating explanations in XAI by conducting a user study with 1200 participants on a visual annotation task, finding that AI assistance improved accuracy and reduced uncertainty, but explanations did not significantly boost accuracy compared to predictions alone and led to over-reliance on incorrect model outputs without lasting learning effects.
Evaluating the quality of explanations in Explainable Artificial Intelligence (XAI) is to this day a challenging problem, with ongoing debate in the research community. While some advocate for establishing standardized offline metrics, others emphasize the importance of human-in-the-loop (HIL) evaluation. Here we propose an experimental design to evaluate the potential of XAI in human-AI collaborative settings as well as the potential of XAI for didactics. In a user study with 1200 participants we investigate the impact of explanations on human performance on a challenging visual task - annotation of biological species in complex taxonomies. Our results demonstrate the potential of XAI in complex visual annotation tasks: users become more accurate in their annotations and demonstrate less uncertainty with AI assistance. The increase in accuracy was, however, not significantly different when users were shown the mere prediction of the model compared to when also providing an explanation. We also find negative effects of explanations: users tend to replicate the model's predictions more often when shown explanations, even when those predictions are wrong. When evaluating the didactic effects of explanations in collaborative human-AI settings, we find that users' annotations are not significantly better after performing annotation with AI assistance. This suggests that explanations in visual human-AI collaboration do not appear to induce lasting learning effects. All code and experimental data can be found in our GitHub repository: https://github.com/TeodorChiaburu/beexplainable.