Explainability's Gain is Optimality's Loss? -- How Explanations Bias Decision-making
This work addresses a critical issue for organizations that use AI in decision-making: a trade-off between explainability and optimality. It offers incremental insights into the mechanisms of bias.
The paper examines how feature-based explanations in machine learning can bias human decision-making: they induce confirmation bias and a disparate impact on the decision-maker's confidence, leading to sub-optimal outcomes, as demonstrated in a field experiment.
Decisions in organizations are about evaluating alternatives and choosing the one that would best serve organizational goals. To the extent that the evaluation of alternatives can be formulated as a predictive task with appropriate metrics, machine learning algorithms are increasingly being used to improve the efficiency of the process. Explanations facilitate communication between the algorithm and the human decision-maker, making it easier for the latter to interpret the former's predictions and to make decisions on their basis. Feature-based explanations, however, carry the semantics of causal models, and this induces leakage from the decision-maker's prior beliefs. Our findings from a field experiment demonstrate empirically how this leakage leads to confirmation bias and a disparate impact on the decision-maker's confidence in the predictions. Such differences in confidence can lead to sub-optimal and biased decision outcomes.