CL Nov 25, 2024

Profiling Bias in LLMs: Stereotype Dimensions in Contextual Word Embeddings

Carolin M. Schuster, Maria-Alexandra Dinisor, Shashwat Ghatiwala, Georg Groh

arXiv:2411.16527v28.711 citations h-index: 3 Has Code NoDaLiDa/Baltic-HLT

Originality Synthesis-oriented

AI Analysis

This work addresses the need for accessible descriptions of bias in AI, which help communicate risks and encourage mitigation. Its contribution is incremental: it applies existing social psychology concepts to LLMs.

The paper tackles bias in large language models by proposing bias profiles built on stereotype dimensions from social psychology, which describe a model's discriminatory properties intuitively. The authors investigate gender bias in contextual embeddings across contexts and layers, and generate profiles for twelve LLMs to expose and visualize bias.

Large language models (LLMs) are the foundation of the current successes of artificial intelligence (AI); however, they are unavoidably biased. To effectively communicate the risks and encourage mitigation efforts, these models need adequate and intuitive descriptions of their discriminatory properties, appropriate for all audiences of AI. We suggest bias profiles with respect to stereotype dimensions based on dictionaries from social psychology research. Along these dimensions we investigate gender bias in contextual embeddings, across contexts and layers, and generate stereotype profiles for twelve different LLMs, demonstrating how intuitive the profiles are and how they can be used to expose and visualize bias.
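To make the idea of a bias profile along stereotype dimensions more concrete, here is a minimal sketch. It is an illustration under stated assumptions, not the authors' implementation: the dimension names (`warmth`, `competence`), the helper functions, the cosine-difference scoring, and the toy 2-D vectors are all hypothetical. In practice the vectors would be contextual embeddings taken from a chosen layer of an LLM, and the pole words would come from a social psychology dictionary.

```python
import math

def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def dimension_direction(high_pole, low_pole):
    """Direction pointing from the low pole to the high pole of a dimension.

    Assumption: each pole is represented by embeddings of its dictionary words.
    """
    hi, lo = mean_vector(high_pole), mean_vector(low_pole)
    return [h - l for h, l in zip(hi, lo)]

def bias_profile(group_a, group_b, dimensions):
    """Per-dimension difference in cosine similarity between two groups.

    Positive values mean group_a lies closer to the high pole than group_b.
    """
    a, b = mean_vector(group_a), mean_vector(group_b)
    profile = {}
    for name, (high_pole, low_pole) in dimensions.items():
        direction = dimension_direction(high_pole, low_pole)
        profile[name] = cosine(a, direction) - cosine(b, direction)
    return profile

# Toy 2-D stand-ins for embeddings (hypothetical values, for illustration only).
dimensions = {
    "warmth": ([[1.0, 0.0]], [[-1.0, 0.0]]),
    "competence": ([[0.0, 1.0]], [[0.0, -1.0]]),
}
group_a = [[1.0, 0.0]]  # e.g. embeddings of terms for one gender
group_b = [[0.0, 1.0]]  # e.g. embeddings of terms for another gender

profile = bias_profile(group_a, group_b, dimensions)
print(profile)
```

Repeating such a computation per layer and per context would give one profile for each setting, which could then be plotted side by side across models.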
