CL, Apr 24, 2024

The PRISM Alignment Dataset: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models

Oxford
arXiv:2404.16019v2, 126 citations, h-index: 28, Advances in Neural Information Processing Systems 37
AI Analysis

This work addresses the challenge of obtaining diverse and representative human feedback for LLM alignment, which matters to developers and researchers building more inclusive and personalised AI systems. The contribution is incremental: it builds on existing alignment methods by supplying new data.

The authors study how human feedback for aligning large language models (LLMs) varies with sociodemographics and preferences. They introduce the PRISM dataset, which covers 1,500 participants from 75 countries and 8,011 conversations with 21 LLMs. The results show that feedback sources need careful consideration in subjective and multicultural contexts.

Human feedback is central to the alignment of Large Language Models (LLMs). However, open questions remain about methods (how), domains (where), people (who) and objectives (to what end) of feedback processes. To navigate these questions, we introduce PRISM, a dataset that maps the sociodemographics and stated preferences of 1,500 diverse participants from 75 countries, to their contextual preferences and fine-grained feedback in 8,011 live conversations with 21 LLMs. With PRISM, we contribute (i) wider geographic and demographic participation in feedback; (ii) census-representative samples for two countries (UK, US); and (iii) individualised ratings that link to detailed participant profiles, permitting personalisation and attribution of sample artefacts. We target subjective and multicultural perspectives on value-laden and controversial issues, where we expect interpersonal and cross-cultural disagreement. We use PRISM in three case studies to demonstrate the need for careful consideration of which humans provide what alignment data.

Code Implementations: 1 repo
