Big5PersonalityEssays: Introducing a Novel Synthetic Generated Dataset Consisting of Short State-of-Consciousness Essays Annotated Based on the Five Factor Model of Personality
This work addresses the limited availability of data for researchers in psychology and AI, though its contribution is incremental: it focuses on dataset creation rather than novel methods.
The study tackles the lack of datasets for computational psychology by introducing a synthetic dataset of short essays annotated with personality traits based on the Five Factor Model, providing a new resource for analysis and AI applications.
Given the rapid advances of large language models (LLMs), it is vitally important to study their behavior and apply them across scientific fields. In recent years, psychology has rarely been approached with novel computational tools. One reason is the high complexity of the data required for a proper analysis. Moreover, psychology, and psychometrics in particular, has few datasets available for analysis and for use in artificial intelligence. For these reasons, this study introduces a synthetic dataset of short essays labeled according to the five factor model (FFM) of personality traits.