CL · AI · CY · Feb 23

InterviewSim: A Scalable Framework for Interview-Grounded Personality Simulation

arXiv:2602.20294v1 · 1 citation · h-index: 15
Originality: Incremental advance
AI Analysis

This work addresses the lack of direct assessment against actual personal data in personality simulation and provides a scalable evaluation framework for researchers in AI and natural language processing.

The paper tackles the problem of evaluating personality simulation in large language models by introducing an interview-grounded framework built on over 671,000 question-answer pairs from 23,000 verified transcripts. It shows that methods grounded in real interview data outperform those that rely only on biographical profiles or the model's parametric knowledge.

Simulating real personalities with large language models requires grounding generation in authentic personal data. Existing evaluation approaches rely on demographic surveys, personality questionnaires, or short AI-led interviews as proxies, but lack direct assessment against what individuals actually said. We address this gap with an interview-grounded evaluation framework for personality simulation at a large scale. We extract over 671,000 question-answer pairs from 23,000 verified interview transcripts across 1,000 public personalities, each with an average of 11.5 hours of interview content. We propose a multi-dimensional evaluation framework with four complementary metrics measuring content similarity, factual consistency, personality alignment, and factual knowledge retention. Through systematic comparison, we demonstrate that methods grounded in real interview data substantially outperform those relying solely on biographical profiles or the model's parametric knowledge. We further reveal a trade-off in how interview data is best utilized: retrieval-augmented methods excel at capturing personality style and response quality, while chronological-based methods better preserve factual consistency and knowledge retention. Our evaluation framework enables principled method selection based on application requirements, and our empirical findings provide actionable insights for advancing personality simulation research.
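The trade-off between retrieval-augmented and chronological use of interview data can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the paper's implementation: the `QAPair` record, the function names, the use of Jaccard token overlap as a stand-in for a real retriever, and the reading of "chronological" as "the most recent pairs in time order".

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class QAPair:
    """Hypothetical record for one interview question-answer pair."""
    date: str       # ISO date of the interview, so string order is time order
    question: str
    answer: str


def tokens(text):
    """Lowercase alphanumeric tokens as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def select_by_retrieval(pairs, query, k):
    """Pick the k pairs whose questions overlap most with the query.

    Jaccard overlap is a toy stand-in for an embedding-based retriever.
    """
    q = tokens(query)

    def score(pair):
        t = tokens(pair.question)
        union = q | t
        return len(q & t) / len(union) if union else 0.0

    return sorted(pairs, key=score, reverse=True)[:k]


def select_chronological(pairs, k):
    """Pick the k most recent pairs, returned in time order (assumed reading)."""
    ordered = sorted(pairs, key=lambda p: p.date)
    return ordered[max(0, len(ordered) - k):]


# Made-up example data, not taken from the paper's dataset.
pairs = [
    QAPair("2015-03-10", "How did you start your career?", "I began in local theatre."),
    QAPair("2018-07-22", "What is your view on climate policy?", "We need to act faster."),
    QAPair("2021-11-05", "What are you working on now?", "A new documentary."),
]

query = "Tell me about how your career started"
retrieved = select_by_retrieval(pairs, query, k=1)
recent = select_chronological(pairs, k=2)
```

In this toy setup, retrieval surfaces the pair most relevant to the question regardless of when it was said, while the chronological selection keeps a contiguous, time-ordered slice of the record. Either selection would then be placed in the model's context before generating a simulated answer.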
