CL | AI | Dec 2, 2024

The "LLM World of Words" English free association norms generated by large language models

arXiv:2412.01330v1 | 9 citations | h-index: 5 | Sci Data
Originality: Incremental advance
AI Analysis

The dataset addresses the need for LLM-generated norms that are directly comparable to human norms, so that LLM biases can be studied with methods from cognitive psychology and linguistics. The advance is incremental because it builds on existing human norms.

The authors address the lack of large-scale LLM-generated free association norms comparable to human norms by creating the "LLM World of Words" (LWOW) dataset: three LLMs (Mistral, Llama3, and Haiku) are prompted with the approximately 12,000 cue words of the human "Small World of Words" (SWOW) norms. Both sets of norms are then used to construct cognitive network models for investigating implicit biases, such as gender stereotypes, in humans and LLMs.

Free associations have been extensively used in cognitive psychology and linguistics for studying how conceptual knowledge is organized. Recently, the potential of applying a similar approach for investigating the knowledge encoded in LLMs has emerged, specifically as a method for investigating LLM biases. However, the absence of large-scale LLM-generated free association norms that are comparable with human-generated norms is an obstacle to this new research direction. To address this limitation, we create a new dataset of LLM-generated free association norms modeled after the "Small World of Words" (SWOW) human-generated norms consisting of approximately 12,000 cue words. We prompt three LLMs, namely Mistral, Llama3, and Haiku, with the same cues as those in the SWOW norms to generate three novel comparable datasets, the "LLM World of Words" (LWOW). Using both SWOW and LWOW norms, we construct cognitive network models of semantic memory that represent the conceptual knowledge possessed by humans and LLMs. We demonstrate how these datasets can be used for investigating implicit biases in humans and LLMs, such as the harmful gender stereotypes that are prevalent both in society and LLM outputs.
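As a rough illustration of how free association norms can be turned into a network, the sketch below builds an undirected weighted graph from (cue, response) pairs, with each edge weighted by how often the pair was produced. This is a minimal sketch under stated assumptions, not the authors' pipeline: the function name `build_association_network`, the lowercasing step, and the toy data are all hypothetical, and the abstract does not specify the actual preprocessing or network construction.

```python
from collections import Counter, defaultdict


def build_association_network(norms):
    """Build an undirected weighted graph from (cue, response) pairs.

    Hypothetical helper for illustration only. The edge weight is the
    number of times a cue/response pair appears, in either direction.
    """
    edge_counts = Counter()
    for cue, response in norms:
        # Assumed normalization: trim whitespace and lowercase.
        cue, response = cue.strip().lower(), response.strip().lower()
        if not cue or not response or cue == response:
            continue  # skip blank entries and self-loops
        edge_counts[frozenset((cue, response))] += 1

    graph = defaultdict(dict)
    for pair, count in edge_counts.items():
        a, b = tuple(pair)
        graph[a][b] = count
        graph[b][a] = count
    return dict(graph)


# Toy, made-up data; not taken from SWOW or LWOW.
toy_norms = [
    ("doctor", "nurse"),
    ("doctor", "hospital"),
    ("nurse", "doctor"),
    ("nurse", "hospital"),
    ("doctor", "Nurse "),
]
network = build_association_network(toy_norms)
```

Building one such graph per source (the human norms and each LLM) would let the same cue's neighborhood be compared across networks, which is the kind of comparison the abstract describes for studying implicit biases.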

Code Implementations: 1 repo
