Evaluating Large Language Models through Gender and Racial Stereotypes
This work addresses bias in AI systems that affects fairness in sensitive decision-making applications, though it is incremental in that it builds on existing bias evaluation frameworks.
The researchers tackled the problem of gender and racial biases in large language models by conducting a comparative study in professional settings, finding that while gender bias has reduced significantly in newer models, racial bias persists.
Language models have ushered in a new age of AI, gaining traction within the NLP community as well as among the general population. AI's ability to make predictions and generate content, together with its use in sensitive decision-making scenarios, makes it all the more important to study these models for biases that may exist and that can be amplified. We conduct a comparative study and establish a framework to evaluate language models for two kinds of bias, gender and race, in a professional setting. We find that while gender bias has decreased considerably in newer models compared to older ones, racial bias still exists.