Personality Structured Interview for Large Language Model Simulation in Personality Research
This work addresses the need for better LLM simulations in psychometrics by offering a domain-specific tool for personality research, though its contribution is incremental: it applies existing interview methods to LLMs.
The paper tackles the problem of large language models (LLMs) failing to generate heterogeneous, human-like data for personality research by grounding simulations in a theory-informed Personality Structured Interview (PSI), using 357 real-human interview transcripts. Results from three experiments show that well-designed structured interviews can improve human-like heterogeneity in LLM-simulated data and predict personality-related behavioral outcomes.
Although psychometrics researchers have recently explored the use of large language models (LLMs) as proxies for human participants, LLMs often fail to generate heterogeneous data with human-like diversity, which diminishes their value in advancing social science research. To address this challenge, we explored the potential of the theory-informed Personality Structured Interview (PSI) as a tool for simulating human responses in personality research. In this approach, the simulation is grounded in nuanced real-human interview transcripts that target the personality construct of interest. We provide a growing set of 357 structured interview transcripts from a representative sample, each containing an individual's responses to 32 open-ended questions carefully designed to gather theory-based personality evidence. Additionally, drawing on psychometric research, we outline an evaluation framework to systematically validate LLM-generated psychometric data. Results from three experiments demonstrate that well-designed structured interviews could improve human-like heterogeneity in LLM-simulated personality data and predict personality-related behavioral outcomes (i.e., organizational citizenship behaviors and counterproductive work behaviors). We further discuss the role of theory-informed structured interviews in LLM-based simulation and outline a general framework for designing structured interviews to simulate human-like data for psychometric research.
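To make "human-like heterogeneity" concrete, the sketch below compares the spread of simulated scale scores with the spread of human scores. It is an illustrative check only and is not taken from the paper's evaluation framework, whose details the abstract does not give. The function name `heterogeneity_ratio`, the use of the standard deviation as the spread measure, and the example scores are all assumptions made for illustration.

```python
from statistics import pstdev


def heterogeneity_ratio(simulated, human):
    """Return the ratio of simulated to human score spread.

    Spread is measured with the population standard deviation.
    A ratio well below 1 suggests the simulated data are more
    homogeneous than the human data (an assumed, simplified criterion).
    """
    human_sd = pstdev(human)
    if human_sd == 0:
        raise ValueError("human scores have zero variance")
    return pstdev(simulated) / human_sd


# Hypothetical 1-5 scale scores, invented for illustration only.
human_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
simulated_scores = [2.0, 2.5, 3.0, 3.5, 4.0]

ratio = heterogeneity_ratio(simulated_scores, human_scores)
print(f"SD ratio (simulated / human): {ratio:.2f}")
```

In this made-up example the simulated scores cluster around the scale midpoint, so the ratio falls below 1. A fuller validation would presumably also examine reliability, factor structure, and criterion relationships, but those checks are beyond what this sketch attempts.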