AI · CL · CY · Apr 2, 2025

The LLM Wears Prada: Analysing Gender Bias and Stereotypes through Online Shopping Data

arXiv:2504.01951v1 · 2 citations · h-index: 11
Originality: Incremental advance
AI Analysis

This work examines gender bias in LLMs applied to online shopping data and highlights persistent stereotypes that could affect fairness in AI systems. It is an incremental advance because it builds on existing bias research.

The study tackles gender bias in Large Language Models by testing whether they can predict gender from online shopping histories. The models achieve moderate accuracy but rely on stereotypical associations; bias-mitigation instructions reduce their certainty without eliminating the stereotypes.

With the wide and cross-domain adoption of Large Language Models, it becomes crucial to assess to what extent the statistical correlations in training data, which underlie their impressive performance, hide subtle and potentially troubling biases. Gender bias in LLMs has been widely investigated from the perspectives of jobs, hobbies, and emotions typically associated with a specific gender. In this study, we introduce a novel perspective. We investigate whether LLMs can predict an individual's gender based solely on online shopping histories and whether these predictions are influenced by gender biases and stereotypes. Using a dataset of historical online purchases from users in the United States, we evaluate the ability of six LLMs to classify gender, and we then analyze their reasoning and product-gender co-occurrences. Results indicate that while models can infer gender with moderate accuracy, their decisions are often rooted in stereotypical associations between product categories and gender. Furthermore, explicit instructions to avoid bias reduce the certainty of model predictions, but do not eliminate stereotypical patterns. Our findings highlight the persistent nature of gender biases in LLMs and emphasize the need for robust bias-mitigation strategies.
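
The two measurements the abstract mentions, classification accuracy and product-gender co-occurrence, can be illustrated with a minimal sketch. It is not the authors' code: the record layout, the function names `accuracy` and `category_prediction_share`, and the toy records are all hypothetical and chosen only to show the shape of the computation.

```python
from collections import Counter, defaultdict

def accuracy(records):
    """Fraction of records whose predicted label matches the reported label."""
    if not records:
        return 0.0
    correct = sum(1 for r in records if r["predicted"] == r["actual"])
    return correct / len(records)

def category_prediction_share(records):
    """For each product category, the share of purchase histories containing
    that category that received each predicted label."""
    counts = defaultdict(Counter)
    for r in records:
        # set() so a category is counted once per history, not once per item
        for category in set(r["categories"]):
            counts[category][r["predicted"]] += 1
    return {
        category: {label: n / sum(c.values()) for label, n in c.items()}
        for category, c in counts.items()
    }

# Hypothetical toy records, invented for illustration only (not from the paper).
toy_records = [
    {"categories": ["cosmetics", "books"], "predicted": "female", "actual": "female"},
    {"categories": ["tools", "books"], "predicted": "male", "actual": "female"},
    {"categories": ["tools"], "predicted": "male", "actual": "male"},
    {"categories": ["cosmetics"], "predicted": "female", "actual": "male"},
]

toy_accuracy = accuracy(toy_records)
toy_shares = category_prediction_share(toy_records)
```

In this toy setup, a category whose share is close to 1.0 for a single label while overall accuracy stays moderate would be the kind of pattern the abstract describes as a stereotypical association.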
