WordBias: An Interactive Visual Tool for Discovering Intersectional Biases Encoded in Word Embeddings
WordBias addresses the challenge of discovering intersectional biases, a problem relevant to researchers and practitioners working on fairness in NLP. The contribution is incremental in that it builds on existing bias detection methods.
The paper tackles the problem of identifying intersectional biases in word embeddings by developing WordBias, an interactive visual tool that computes and visualizes word associations across social groups. A case study demonstrates its use in uncovering biases against groups such as Black Muslim Males and Poor Females.
Intersectional bias is bias caused by the overlap of multiple social factors such as gender, sexuality, race, disability, and religion. A recent study has shown that word embedding models can be laden with biases against intersectional groups such as African American females. The first step towards tackling such intersectional biases is to identify them. However, discovering biases against different intersectional groups remains a challenging task. In this work, we present WordBias, an interactive visual tool designed to explore biases against intersectional groups encoded in static word embeddings. Given a pretrained static word embedding, WordBias computes the association of each word along different groups based on attributes such as race and age, and then visualizes these associations using a novel interactive interface. Using a case study, we demonstrate how WordBias can help uncover biases against intersectional groups such as Black Muslim Males and Poor Females encoded in word embeddings. We also evaluate our tool using qualitative feedback from expert interviews. The source code for this tool is publicly available for reproducibility at github.com/bhavyaghai/WordBias.
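To make the idea of a per-word association along several group axes concrete, here is a minimal sketch. The abstract does not state which bias metric WordBias uses, so the score below (difference of cosine similarities to two group centroids) is an assumption chosen for illustration and is not taken from the tool's source code. The toy 3-dimensional vectors, the placeholder words `word_x` and `word_y`, the axis names, the threshold, and all function names are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def centroid(vectors):
    """Component-wise mean of a list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def bias_score(word_vec, group_a_vecs, group_b_vecs):
    """Assumed metric: positive means closer to group A, negative to group B."""
    return (cosine(word_vec, centroid(group_a_vecs))
            - cosine(word_vec, centroid(group_b_vecs)))

# Made-up toy embedding; a real run would load a pretrained static embedding.
toy = {
    "he": [1.0, 0.0, 0.0],
    "she": [-1.0, 0.0, 0.0],
    "rich": [0.0, 1.0, 0.0],
    "poor": [0.0, -1.0, 0.0],
    "word_x": [-0.6, -0.7, 0.2],
    "word_y": [0.5, 0.1, 0.8],
}

# Each axis is defined by two lists of group words: (group A, group B).
axes = {
    "gender": (["he"], ["she"]),
    "wealth": (["rich"], ["poor"]),
}

def scores_for(word):
    """Association of one word along every axis."""
    return {
        axis: bias_score(toy[word], [toy[w] for w in a], [toy[w] for w in b])
        for axis, (a, b) in axes.items()
    }

def associated_with(word, wanted, threshold=0.5):
    """wanted maps axis -> +1 (group A side) or -1 (group B side).

    A word counts as associated with the intersectional group only if it
    leans towards the wanted side on every listed axis.
    """
    s = scores_for(word)
    return all(s[axis] * sign > threshold for axis, sign in wanted.items())

candidates = ["word_x", "word_y"]
poor_female = [w for w in candidates
               if associated_with(w, {"gender": -1, "wealth": -1})]
print(poor_female)
```

In this sketch, an intersectional group such as Poor Females is expressed as a conjunction of per-axis conditions, which mirrors the abstract's description of associations computed along separate group dimensions and then explored jointly. How WordBias itself normalizes scores or selects words may differ.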