CL · AI · Feb 17, 2025

Auto-Search and Refinement: An Automated Framework for Gender Bias Mitigation in Large Language Models

arXiv:2502.11559v3 · 1 citation · h-index: 4 · Has Code
Originality: Incremental advance
AI Analysis

This paper addresses gender bias mitigation in LLMs with a flexible, model-independent method, though the advance appears incremental because it builds on existing instruction-based methods.

The authors tackled the problem of gender bias in large language models by proposing FaIRMaker, an automated framework that generates and refines Fairwords as instructions to reduce bias without compromising task performance, achieving effective mitigation across both API-based and open-source models.

Pre-training large language models (LLMs) on vast text corpora enhances natural language processing capabilities but risks encoding social biases, particularly gender bias. While parameter-modification methods like fine-tuning mitigate bias, they are resource-intensive, unsuitable for closed-source models, and lack adaptability to evolving societal norms. Instruction-based approaches offer flexibility but often compromise task performance. To address these limitations, we propose FaIRMaker, an automated and model-independent framework that employs an auto-search and refinement paradigm to adaptively generate Fairwords, which act as instructions integrated into input queries to reduce gender bias and enhance response quality. Extensive experiments demonstrate that FaIRMaker automatically searches for and dynamically refines Fairwords, effectively mitigating gender bias while preserving task integrity and ensuring compatibility with both API-based and open-source LLMs.
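
The abstract describes Fairwords as instructions integrated into input queries, which is why the approach can work with API-based models as well as open-source ones. A minimal sketch of that integration step alone might look like the following. The function name `integrate_fairword`, the `position` parameter, and the example Fairword text are all hypothetical; the abstract does not specify how FaIRMaker formats or places its Fairwords, and the search and refinement stages are not shown.

```python
def integrate_fairword(query: str, fairword: str, position: str = "prefix") -> str:
    """Attach an instruction string to a user query without touching the model.

    Hypothetical helper: the placement options are assumptions made for
    illustration, not the paper's documented format.
    """
    query, fairword = query.strip(), fairword.strip()
    if not fairword:
        # With no instruction available, the query is passed through unchanged.
        return query
    if position == "prefix":
        return f"{fairword}\n\n{query}"
    if position == "suffix":
        return f"{query}\n\n{fairword}"
    raise ValueError(f"unknown position: {position!r}")


# Hand-written stand-in for a Fairword; in the paper these are generated
# automatically by the search-and-refinement pipeline.
EXAMPLE_FAIRWORD = "Answer without assuming anyone's gender from their occupation."

prompt = integrate_fairword("Describe a typical day for a nurse.", EXAMPLE_FAIRWORD)
```

Because only the prompt text changes, the resulting string can be sent to any chat or completion endpoint, which is consistent with the model-independence the abstract claims.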

