CL AI Aug 3, 2023

NBIAS: A Natural Language Processing Framework for Bias Identification in Text

Shaina Raza, Muskan Garg, Deepak John Reji, Syed Raza Bashir, Chen Ding

arXiv:2308.01681v38.178 citations h-index: 13

Originality: Incremental advance

AI Analysis

The paper addresses unfair outcomes for users of AI systems trained on biased data, but the advance is incremental because it builds on existing bias-detection methods.

The paper tackles the problem of bias in textual data by developing the NBIAS framework, which uses a transformer-based model to identify bias words/phrases and achieves accuracy improvements of 1% to 8% compared to baselines.

Bias in textual data can lead to skewed interpretations and outcomes when the data is used. These biases could perpetuate stereotypes, discrimination, or other forms of unfair treatment. An algorithm trained on biased data may end up making decisions that disproportionately impact a certain group of people. Therefore, it is crucial to detect and remove these biases to ensure the fair and ethical use of data. To this end, we develop a comprehensive and robust framework, NBIAS, that consists of four main layers: data, corpus construction, model development, and evaluation. The dataset is constructed by collecting diverse data from various domains, including social media, healthcare, and job hiring portals. We then apply a transformer-based token classification model that identifies biased words and phrases through a unique named entity, BIAS. In the evaluation procedure, we combine quantitative and qualitative measures to gauge the effectiveness of our models. We achieve accuracy improvements ranging from 1% to 8% compared to baselines. We are also able to develop a robust understanding of how the model functions. The proposed approach is applicable to a variety of biases and contributes to the fair and ethical use of textual data.
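To make the token classification idea concrete, the sketch below shows one way per-token predictions could be turned into bias spans. It is only an illustration: the abstract names a single entity type, BIAS, but does not state the tagging scheme, so the `B-BIAS` / `I-BIAS` / `O` labels, the helper name `extract_bias_spans`, and the example sentence are all assumptions rather than the authors' implementation. The tags are hand-written stand-ins for what a trained classifier might emit.

```python
from typing import List, Tuple


def extract_bias_spans(tokens: List[str], tags: List[str]) -> List[Tuple[int, int, str]]:
    """Group BIO-tagged tokens into (start, end_exclusive, text) BIAS spans.

    Assumes a hypothetical B-BIAS / I-BIAS / O label set.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    spans: List[Tuple[int, int, str]] = []
    start = None  # index where the currently open span began, if any
    for i, tag in enumerate(tags):
        if tag == "B-BIAS":
            # A new span begins; close any span that is still open.
            if start is not None:
                spans.append((start, i, " ".join(tokens[start:i])))
            start = i
        elif tag == "I-BIAS":
            # Continue the open span; tolerate an I- tag with no preceding B-.
            if start is None:
                start = i
        else:
            # Any other tag (e.g. "O") closes the open span.
            if start is not None:
                spans.append((start, i, " ".join(tokens[start:i])))
                start = None
    if start is not None:
        spans.append((start, len(tokens), " ".join(tokens[start:])))
    return spans


# Hypothetical sentence and hand-assigned tags, for illustration only.
tokens = ["Older", "applicants", "are", "too", "slow", "to", "learn", "new", "tools"]
tags = ["O", "O", "O", "B-BIAS", "I-BIAS", "O", "O", "O", "O"]

print(extract_bias_spans(tokens, tags))  # [(3, 5, 'too slow')]
```

Framing bias detection as span extraction, rather than as a single label per sentence, is what lets a model point to the specific words or phrases it considers biased.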
