AI CL HC | Jun 7, 2023

Personality testing of Large Language Models: Limited temporal stability, but highlighted prosociality

Bojana Bodroza, Bojana M. Dinic, Ljubisa Bojic

arXiv:2306.04308v3 | 18.330 citations | h-index: 14

Originality Synthesis-oriented

AI Analysis

This research addresses the need to understand LLMs' societal impact by evaluating their reliability in simulating stable personality traits, which is crucial for AI safety and user interaction, though it is incremental in building on existing personality assessment methods.

The study assessed the temporal stability and inter-rater agreement of personality traits in seven large language models (LLMs) by administering personality instruments at two time points. Agreement varied across models (e.g., higher for Llama3 and GPT-4o, lower for GPT-4 and Gemini), and the LLMs generally exhibited a prosocial personality profile, with higher agreeableness and conscientiousness and lower Machiavellianism.

As Large Language Models (LLMs) continue to gain popularity due to their human-like traits and the intimacy they offer to users, their societal impact inevitably expands. This creates a growing need for comprehensive studies to fully understand LLMs and reveal their potential opportunities, drawbacks, and overall societal impact. With that in mind, this research conducted an extensive investigation into seven LLMs, aiming to assess the temporal stability and inter-rater agreement of their responses to personality instruments at two time points. In addition, the LLMs' personality profiles were analyzed and compared to human normative data. The findings revealed varying levels of inter-rater agreement in the LLMs' responses over a short time, with some LLMs showing higher agreement (e.g., Llama3 and GPT-4o) than others (e.g., GPT-4 and Gemini). Furthermore, agreement depended on the instruments used as well as on the domain or trait. This implies variable robustness in LLMs' ability to reliably simulate stable personality characteristics. For scales that showed at least fair agreement, LLMs displayed a mostly socially desirable profile in both agentic and communal domains, as well as a prosocial personality profile reflected in higher agreeableness and conscientiousness and lower Machiavellianism. Exhibiting temporal stability and coherent responses on personality traits is crucial for AI systems due to their societal impact and AI safety concerns.
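
To make "agreement between two time points" concrete, the following is a minimal sketch rather than the authors' procedure. It assumes agreement is measured with Cohen's kappa over item-level responses and labelled with the conventional Landis and Koch bands (where "fair" covers 0.21 to 0.40); the abstract does not state which statistic or thresholds the study used. The function names `cohen_kappa` and `agreement_label` and the sample responses are hypothetical.

```python
from collections import Counter


def cohen_kappa(time1, time2):
    """Cohen's kappa for two equally long lists of categorical responses.

    Assumption: each list holds one model's item-level answers (e.g. Likert
    points) from one administration of the same questionnaire.
    """
    if len(time1) != len(time2) or not time1:
        raise ValueError("both administrations need the same non-zero length")
    n = len(time1)
    # Proportion of items answered identically on both occasions.
    observed = sum(a == b for a, b in zip(time1, time2)) / n
    # Agreement expected by chance from the marginal response frequencies.
    c1, c2 = Counter(time1), Counter(time2)
    expected = sum(c1[k] * c2[k] for k in c1.keys() | c2.keys()) / (n * n)
    if expected == 1.0:
        # Both administrations gave the same single answer to every item.
        return 1.0
    return (observed - expected) / (1 - expected)


def agreement_label(kappa):
    """Map kappa to the conventional Landis and Koch descriptive bands."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical answers to four items at two time points.
t1 = [1, 1, 2, 2]
t2 = [1, 2, 2, 2]
k = cohen_kappa(t1, t2)
print(k, agreement_label(k))
```

Under these assumptions, a scale would count as showing "at least fair agreement" when its kappa exceeds 0.20. For ordinal Likert data a weighted kappa or an intraclass correlation may be more appropriate than the unweighted version sketched here.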
