CL | Sep 30, 2025

PrimeX: A Dataset of Worldview, Opinion, and Explanation

Rik Koncel-Kedziorski, Brihi Joshi, Tim Paek

arXiv:2510.00174v16.72 citations | h-index: 3 | EMNLP

Originality: Synthesis-oriented

AI Analysis

This dataset benefits NLP and psychological research by enabling better user representation in language models, though the contribution is incremental, building on prior opinion prediction work.

The authors address the problem of personalizing language models by introducing PrimeX, a dataset that combines public opinion survey responses from 858 US residents with the respondents' written explanations and worldview assessments, and they show that this additional belief information helps personalize language models.

As the adoption of language models advances, so does the need to better represent individual users to the model. Are there aspects of an individual's belief system that a language model can utilize for improved alignment? Following prior research, we investigate this question in the domain of opinion prediction by developing PrimeX, a dataset of public opinion survey data from 858 US residents with two additional sources of belief information: written explanations from the respondents for why they hold specific opinions, and the Primal World Belief survey for assessing respondent worldview. We provide an extensive initial analysis of our data and show the value of belief explanations and worldview for personalizing language models. Our results demonstrate how the additional belief information in PrimeX can benefit both the NLP and psychological research communities, opening up avenues for further study.
