Causal Responsibility Attribution for Human-AI Collaboration
This work addresses the need for fair responsibility attribution in human-AI systems, which is crucial for ethical decision-making across many fields, though the contribution appears incremental because it builds on existing causal methods.
The paper tackles the problem of attributing responsibility for undesirable outcomes in human-AI collaboration. It proposes a causal framework based on Structural Causal Models that measures blameworthiness and uses counterfactual reasoning to account for agents' expected epistemic levels, and it demonstrates the framework in two case studies.
As Artificial Intelligence (AI) systems increasingly influence decision-making across various fields, attributing responsibility for undesirable outcomes has become essential, yet it is complicated by the intricate interplay between humans and AI. Existing attribution methods based on actual causality and Shapley values tend to disproportionately blame agents who contribute more to an outcome and rely on real-world measures of blameworthiness that may misalign with responsible AI standards. This paper presents a causal framework using Structural Causal Models (SCMs) to systematically attribute responsibility in human-AI systems, measuring overall blameworthiness while employing counterfactual reasoning to account for agents' expected epistemic levels. Two case studies illustrate the framework's adaptability in diverse human-AI collaboration scenarios.
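The abstract does not give the framework's formal definitions, so the following is only a minimal sketch of the general idea of counterfactual reasoning in an SCM, not the paper's method. It assumes a hypothetical two-agent diagnostic scenario: the names `Context`, `ai_policy`, `human_policy`, `reference_ai`, `reference_human`, and `counterfactual_blame` are all illustrative. Each agent's actual behaviour is swapped for a "reference" policy standing in for its expected epistemic level, while the exogenous context is held fixed. An agent is flagged only if that swap would have avoided the undesirable outcome.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """Exogenous variables of the toy SCM (hypothetical scenario)."""
    patient_is_ill: bool      # ground truth, not directly observed
    scan_is_ambiguous: bool   # whether the evidence is hard to read


def ai_policy(ctx):
    # Actual AI: flags illness only when the scan is unambiguous.
    return ctx.patient_is_ill and not ctx.scan_is_ambiguous


def human_policy(ctx, ai_flag):
    # Actual human: defers to the AI's flag without further checks.
    return ai_flag


def reference_ai(ctx):
    # Assumed expected standard for the AI: it is NOT expected to
    # resolve ambiguous scans, so it matches the actual AI here.
    return ctx.patient_is_ill and not ctx.scan_is_ambiguous


def reference_human(ctx, ai_flag):
    # Assumed expected standard for the human: order a follow-up test
    # on ambiguous scans (which reveals the truth), otherwise defer.
    if ctx.scan_is_ambiguous:
        return ctx.patient_is_ill
    return ai_flag


def run(ctx, ai=ai_policy, human=human_policy):
    """Evaluate the structural equations; True means undesirable outcome."""
    ai_flag = ai(ctx)
    treat = human(ctx, ai_flag)
    return ctx.patient_is_ill and not treat


def counterfactual_blame(ctx):
    """Flag an agent if replacing it with its reference policy, with the
    exogenous context held fixed, would have avoided the bad outcome."""
    actual = run(ctx)
    return {
        "ai": actual and not run(ctx, ai=reference_ai),
        "human": actual and not run(ctx, human=reference_human),
    }


if __name__ == "__main__":
    print(counterfactual_blame(Context(patient_is_ill=True, scan_is_ambiguous=True)))
```

In this toy setting the AI's miss sits on the causal path to the harm, yet only the human is flagged, because only the human fell short of the assumed reference standard. A plain but-for test against an ideal action would instead implicate both agents, which is the kind of contrast the abstract draws with contribution-based attribution.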