CY | AI | LG | Jun 14, 2021

Can Explainable AI Explain Unfairness? A Framework for Evaluating Explainable AI

arXiv:2106.07483v1 | 45 citations
Originality: Incremental advance
AI Analysis

This paper addresses the problem of ensuring AI fairness for developers and users by providing a practical evaluation framework, though the advance is incremental because it builds on existing critiques of XAI tools.

The paper tackles 'fairwashing', the problem of explainable AI (XAI) tools potentially misleading users about model fairness, by creating a framework to evaluate how well these tools detect and communicate bias. The results show that many prominent XAI tools lack critical features for detecting bias, and that the framework can guide developers in modifying their tools to reduce fairwashing.

Many ML models are opaque to humans, producing decisions too complex for humans to easily understand. In response, explainable artificial intelligence (XAI) tools that analyze the inner workings of a model have been created. Despite these tools' strength in translating model behavior, critiques have raised concerns that XAI tools can serve as a means of 'fairwashing' by misleading users into trusting biased or incorrect models. In this paper, we created a framework for evaluating explainable AI tools with respect to their capabilities for detecting and addressing issues of bias and fairness, as well as their capacity to communicate these results to their users clearly. We found that despite their capabilities in simplifying and explaining model behavior, many prominent XAI tools lack features that could be critical in detecting bias. Developers can use our framework to identify modifications needed in their toolkits to reduce issues like fairwashing.
