Appropriate Fairness Perceptions? On the Effectiveness of Explanations in Enabling People to Assess the Fairness of Automated Decision Systems
This work addresses the challenge of ensuring that users can correctly judge the fairness of automated systems, which is crucial for accountability and trust in AI applications. The contribution is incremental, as it builds on existing explanation research.
The paper tackles the problem of evaluating explanations for automated decision systems (ADS) by proposing that explanations should enable users to accurately assess system fairness, rather than just increase positive perceptions. The result is a novel study design and the introduction of the desideratum of appropriate fairness perceptions, with next steps outlined for a comprehensive experiment.
It is often argued that one goal of explaining automated decision systems (ADS) is to facilitate users' positive perceptions (e.g., of fairness or trustworthiness) of such systems. This viewpoint, however, implicitly assumes that a given ADS is fair and trustworthy to begin with. If the ADS issues unfair outcomes, then one might expect that explanations of the system's workings will reveal its shortcomings and, hence, lead to a decrease in fairness perceptions. Consequently, we suggest that it is more meaningful to evaluate explanations by how effectively they enable people to appropriately assess the quality (e.g., fairness) of the associated ADS. We argue that, for an effective explanation, perceptions of fairness should increase if and only if the underlying ADS is fair. In this in-progress work, we introduce the desideratum of appropriate fairness perceptions, propose a novel study design for evaluating it, and outline next steps towards a comprehensive experiment.
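The "if and only if" condition can be made concrete with a minimal sketch. It is an illustration under stated assumptions, not the study design proposed in this work: it assumes that fairness perceptions are collected as numeric ratings before and after participants see an explanation, and that the ground-truth fairness of the ADS is known to the experimenter. The names `perception_shift` and `is_appropriate` are hypothetical.

```python
from statistics import mean


def perception_shift(before, after):
    """Mean change in fairness ratings after the explanation is shown.

    Assumption: `before` and `after` are numeric ratings on the same scale.
    """
    return mean(after) - mean(before)


def is_appropriate(ads_is_fair, before, after):
    """Check the desideratum: ratings increase if and only if the ADS is fair.

    Assumption: `ads_is_fair` is a ground-truth label known to the experimenter.
    """
    increased = perception_shift(before, after) > 0
    return increased == ads_is_fair
```

For example, with made-up ratings, `is_appropriate(True, [3, 4, 3], [4, 5, 4])` holds because perceptions rise for a fair system, whereas the same rise for an unfair system (`ads_is_fair=False`) would violate the desideratum. A real analysis would likely replace the raw difference in means with a statistical test; that choice is left open here.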