CY HC LG · Jun 12, 2023

Towards Fair and Explainable AI using a Human-Centered AI Approach

arXiv:2306.07427v13.34 citations · h-index: 10 · Has Code

Originality: Incremental advance

AI Analysis

The dissertation addresses fairness and explainability issues for users and developers of AI systems, but its advance is incremental: it builds on existing human-centered approaches and contributes specific tools and empirical studies.

This dissertation tackles fairness and explainability in AI by developing human-centered tools and conducting empirical studies on classification systems and word embeddings. It reports enhanced trust and bias mitigation, with findings such as explanations supporting trust calibration and tools identifying biases in datasets and creative writing.

The rise of machine learning (ML) has been accompanied by several high-profile cases that underscore the need for fairness, accountability, explainability and trust in ML systems. The existing literature has largely focused on fully automated ML approaches that try to optimize for some performance metric. However, human-centric measures such as fairness, trust and explainability are subjective, context-dependent, and might not correlate with conventional performance metrics. To deal with these challenges, we explore a human-centered AI approach that empowers people by providing more transparency and human control. In this dissertation, we present five research projects that aim to enhance explainability and fairness in classification systems and word embeddings. The first project explores the benefits and drawbacks of introducing local model explanations as interfaces for machine teachers (crowd workers). Our study found that adding explanations supports trust calibration for the resulting ML model and enables rich forms of teaching feedback. The second project presents D-BIAS, a causality-based human-in-the-loop visual tool for identifying and mitigating social biases in tabular datasets. Apart from improving fairness, we found that our tool also enhances trust and accountability. The third project presents WordBias, a visual interactive tool that helps audit pre-trained static word embeddings for biases against groups, such as females, or subgroups, such as Black Muslim females. The fourth project presents DramatVis Personae, a visual analytics tool that helps identify social biases in creative writing. Finally, the last project presents an empirical study aimed at understanding the cumulative impact of multiple fairness-enhancing interventions, applied at different stages of the ML pipeline, on fairness, utility and different population groups. We conclude by discussing future directions.
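
To make the idea of auditing static word embeddings for group bias more concrete, here is a minimal sketch. It is not the WordBias implementation or its scoring method. It assumes one simple, illustrative measure: the difference between a word's cosine similarity to the mean vector of one group's words and its similarity to the mean vector of another group's words. The function names (`cosine`, `mean_vector`, `bias_score`) and the 3-dimensional toy vectors are hypothetical and made up for illustration; a real audit would load pre-trained embeddings instead.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def mean_vector(vectors):
    """Component-wise mean of a list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def bias_score(word_vec, group_a_vecs, group_b_vecs):
    """Illustrative score (an assumption, not the paper's metric):
    positive means the word sits closer to group A's mean vector,
    negative means it sits closer to group B's."""
    return (cosine(word_vec, mean_vector(group_a_vecs))
            - cosine(word_vec, mean_vector(group_b_vecs)))

# Toy embeddings, invented for this example only.
toy = {
    "he":       [1.0, 0.0, 0.2],
    "man":      [0.9, 0.1, 0.3],
    "she":      [0.0, 1.0, 0.2],
    "woman":    [0.1, 0.9, 0.3],
    "engineer": [0.8, 0.2, 0.5],
    "nurse":    [0.2, 0.8, 0.5],
}

male = [toy["he"], toy["man"]]
female = [toy["she"], toy["woman"]]

# Score each target word against the two group word sets.
scores = {w: bias_score(toy[w], male, female) for w in ("engineer", "nurse")}
```

With these toy vectors, `scores["engineer"]` is positive and `scores["nurse"]` is negative, which is the kind of association an auditor would want to surface and inspect. Subgroup audits would need additional group word sets and a way to combine them, which this sketch does not attempt.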
