On Explaining Unfairness: An Overview
This work addresses the integration of fairness and explainability in AI for researchers and practitioners. Its contribution is incremental, as it primarily organizes existing knowledge.
The paper examines the interplay between algorithmic fairness and explainability. It presents a taxonomy for each field and categorizes explanations for fairness into three types: those that enhance fairness metrics, those that help understand the causes of (un)fairness, and those that assist in designing mitigation methods. The resulting framework identifies gaps for future research.
Algorithmic fairness and explainability are foundational elements for achieving responsible AI. In this paper, we focus on their interplay, a research area that has recently received increasing attention. To this end, we first present two comprehensive taxonomies, one for each of the two complementary fields of study: fairness and explanations. Then, we categorize explanations for fairness into three types: (a) explanations to enhance fairness metrics, (b) explanations to help us understand the causes of (un)fairness, and (c) explanations to assist us in designing methods for mitigating unfairness. Finally, based on our fairness and explanation taxonomies, we present undiscovered literature paths, revealing gaps that can serve as valuable insights for future research.