CL | Apr 19, 2025

Bias Analysis and Mitigation through Protected Attribute Detection and Regard Classification

arXiv:2504.14212v3 | 3 citations | h-index: 8 | EMNLP
Originality: Synthesis-oriented
AI Analysis

This paper addresses bias mitigation in LLMs for AI fairness applications, though it appears incremental because it builds on existing bias analysis methods.

The authors tackled the problem of social biases in large language models by developing an annotation pipeline to analyze biases in pretraining corpora, demonstrating its effectiveness on Common Crawl data.

Large language models (LLMs) acquire general linguistic knowledge from massive-scale pretraining. However, pretraining data, composed mainly of web-crawled text, contain undesirable social biases that can be perpetuated or even amplified by LLMs. In this study, we propose an efficient yet effective annotation pipeline to investigate social biases in pretraining corpora. Our pipeline consists of protected attribute detection to identify diverse demographics, followed by regard classification to analyze the language polarity towards each attribute. Through our experiments, we demonstrate the effect of our bias analysis and mitigation measures, focusing on Common Crawl as the most representative pretraining corpus.
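
The two-stage shape of the pipeline described in the abstract (detect protected attributes, then classify regard towards each) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the keyword lexicons, the function names (detect_attributes, classify_regard, annotate, filter_negative), and the filtering step used as a mitigation measure are all hypothetical stand-ins chosen for illustration. A real system would presumably use trained classifiers rather than word lists.

```python
import re
from collections import Counter, defaultdict

# Hypothetical toy lexicons (assumptions for illustration only).
ATTRIBUTE_TERMS = {
    "gender": {"women", "men"},
    "age": {"elderly", "teenagers"},
}
POSITIVE = {"brilliant", "kind", "reliable"}
NEGATIVE = {"lazy", "dangerous", "stupid"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def detect_attributes(text):
    """Stage 1: return the protected attributes mentioned in the text."""
    tokens = set(tokenize(text))
    return sorted(a for a, terms in ATTRIBUTE_TERMS.items() if tokens & terms)


def classify_regard(text):
    """Stage 2: crude polarity label based on lexicon hits."""
    tokens = tokenize(text)
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def annotate(corpus):
    """Count regard labels per detected attribute across a corpus."""
    counts = defaultdict(Counter)
    for text in corpus:
        attributes = detect_attributes(text)
        if not attributes:
            continue  # texts without a protected attribute are not analyzed
        regard = classify_regard(text)
        for attribute in attributes:
            counts[attribute][regard] += 1
    return counts


def filter_negative(corpus):
    """One possible mitigation (an assumption): drop texts that mention
    a protected attribute with negative regard."""
    return [
        text for text in corpus
        if not (detect_attributes(text) and classify_regard(text) == "negative")
    ]


corpus = [
    "Women are brilliant engineers.",
    "The elderly are lazy.",
    "The weather is nice today.",
]
counts = annotate(corpus)
filtered = filter_negative(corpus)
```

Running the detector first keeps the regard step focused on texts that actually mention a demographic group, which is the ordering the abstract describes. The per-attribute counts give a simple view of how polarity is distributed across groups.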

