The Duet of Representations and How Explanations Exacerbate It
This work addresses explanation-induced bias in human-AI collaboration for real-world decision-makers such as employment counselors. The contribution is incremental, but the risk it identifies is a practical one.
The study investigated how explanations from AI models can worsen human decision-making when they highlight features that conflict with the human's prior beliefs, leading to causal overattribution. In a field experiment with employment counselors using an XGBoost model, providing SHAP explanations reduced decision quality when the explanation displayed a feature on which the counselor held a conflicting prior belief.
An algorithm effects a causal representation of relations between features and labels in the human's perception. Such a representation might conflict with the human's prior belief. Explanations can direct the human's attention to the conflicting feature and away from other relevant features. This leads to causal overattribution and may adversely affect the human's information processing. In a field experiment, we implemented an XGBoost-trained model as a decision-making aid for counselors at a public employment service to predict candidates' risk of long-term unemployment. The treatment group of counselors was additionally provided with SHAP explanations of the model's predictions. The results show that the quality of the human's decision-making is worse when a feature on which the human holds a conflicting prior belief is displayed as part of the explanation.
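
To make concrete what a per-feature explanation of this kind shows a counselor, the following is a minimal sketch of exact Shapley-value attribution for a single candidate. It is not the study's pipeline: the risk function `toy_risk`, the feature names, and the baseline candidate are all hypothetical, and the sketch replaces absent features with a single reference baseline, which is a simplification of what the SHAP library estimates for a trained XGBoost model.

```python
from itertools import combinations
from math import factorial

# Hypothetical toy risk score standing in for a trained model.
# This is NOT the study's XGBoost model; thresholds and weights are invented.
def toy_risk(candidate):
    risk = 0.2
    older = candidate["age"] >= 50
    long_spell = candidate["months_unemployed"] >= 6
    if older:
        risk += 0.3
    if long_spell:
        risk += 0.2
    if older and long_spell:
        risk += 0.1  # interaction between the two features
    if candidate["has_degree"]:
        risk -= 0.1
    return risk

def shapley_values(model, instance, baseline):
    """Exact Shapley values of `model` at `instance` relative to `baseline`.

    Features outside a coalition take their baseline value (an assumption
    of this sketch). The cost is exponential in the number of features,
    so this is only practical for very small feature sets.
    """
    features = list(instance)
    n = len(features)
    values = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                without_f = {
                    g: (instance[g] if g in subset else baseline[g])
                    for g in features
                }
                with_f = dict(without_f)
                with_f[f] = instance[f]
                total += weight * (model(with_f) - model(without_f))
        values[f] = total
    return values

# Hypothetical candidate and reference point
candidate = {"age": 55, "months_unemployed": 8, "has_degree": True}
baseline = {"age": 30, "months_unemployed": 1, "has_degree": False}

phi = shapley_values(toy_risk, candidate, baseline)
for name, value in sorted(phi.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {value:+.2f}")
```

The attributions sum to the difference between the candidate's predicted risk and the baseline's. An explanation that displays only the top-ranked feature (here `age`) is the situation the abstract describes: if the counselor holds a conflicting prior belief about that feature, their attention may be drawn to it and away from the other contributing features.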