CL AI HC

Sep 20, 2024

'Since Lawyers are Males..': Examining Implicit Gender Bias in Hindi Language Generation by LLMs

Ishika Joshi, Ishita Gupta, Adrita Dey, Tapan Parikh

arXiv:2409.13484v1

11 citations

h-index: 3

Originality: Synthesis-oriented

AI Analysis

This research highlights a critical fairness issue for users of generative AI in underrepresented languages like Hindi, though its contribution is incremental, since it extends established bias analysis to a new language.

The study examined implicit gender bias in Hindi text generation by LLMs and found that GPT-4o showed a significant bias of 87.8% in Hindi, compared to 33.4% in English, with Hindi responses often relying on stereotypes related to occupations and social hierarchies.

Large Language Models (LLMs) are increasingly being used to generate text across various languages, for tasks such as translation, customer support, and education. Despite these advancements, LLMs show notable gender biases in English, which become even more pronounced when generating content in relatively underrepresented languages like Hindi. This study explores implicit gender biases in Hindi text generation and compares them to those in English. We developed Hindi datasets inspired by WinoBias to examine stereotypical patterns in responses from models like GPT-4o and Claude-3 Sonnet. Our results reveal a significant gender bias of 87.8% in Hindi, compared to 33.4% in English, in GPT-4o generation, with Hindi responses frequently relying on gender stereotypes related to occupations, power hierarchies, and social class. This research underscores the variation in gender biases across languages and provides considerations for navigating these biases in generative AI systems.
