Simulating a Bias Mitigation Scenario in Large Language Models
This work addresses fairness and trust issues in NLP for users and developers, but its contribution is incremental: it builds on existing knowledge and adds new simulations.
This paper tackles the problem of biases in Large Language Models by analyzing their sources and implementing a simulation framework to evaluate mitigation strategies, resulting in empirical validation of approaches like data curation and debiasing.
Large Language Models (LLMs) have fundamentally transformed the field of natural language processing; however, their vulnerability to biases presents a notable obstacle that threatens both fairness and trust. This review offers an extensive analysis of the bias landscape in LLMs, tracing its roots and expressions across various NLP tasks. Biases are classified into implicit and explicit types, with particular attention given to their emergence from data sources, architectural designs, and contextual deployments. This study advances beyond theoretical analysis by implementing a simulation framework designed to evaluate bias mitigation strategies in practice. The framework integrates multiple approaches, including data curation, debiasing during model training, and post-hoc output calibration, and assesses their impact in controlled experimental settings. In summary, this work not only synthesizes existing knowledge on bias in LLMs but also contributes original empirical validation through simulation of mitigation strategies.
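The abstract does not describe the framework's internals, so the following is only a minimal sketch of what measuring one mitigation step in a controlled setting might look like. It illustrates post-hoc output calibration on toy data. The function names `parity_gap` and `calibrate`, the group labels, and the scores are all hypothetical and are not taken from the paper.

```python
from statistics import mean

def group_means(scores, groups):
    """Return the mean score for each group label."""
    by_group = {}
    for score, group in zip(scores, groups):
        by_group.setdefault(group, []).append(score)
    return {group: mean(values) for group, values in by_group.items()}

def parity_gap(scores, groups):
    """Largest difference in mean score between any two groups (hypothetical metric)."""
    means = group_means(scores, groups).values()
    return max(means) - min(means)

def calibrate(scores, groups):
    """Toy post-hoc calibration: shift each group's scores so that
    its mean matches the overall mean. An assumption for illustration,
    not the paper's method."""
    overall = mean(scores)
    offsets = {g: overall - m for g, m in group_means(scores, groups).items()}
    return [score + offsets[group] for score, group in zip(scores, groups)]

# Toy model outputs for two hypothetical demographic groups.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
groups = ["a", "a", "a", "b", "b", "b"]

before = parity_gap(scores, groups)
after = parity_gap(calibrate(scores, groups), groups)
print(f"gap before: {before:.3f}, gap after: {after:.3f}")
```

Mean shifting is chosen here only because it is easy to verify by hand. A real evaluation would likely combine several metrics and compare calibration against data curation and training-time debiasing under the same conditions.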