PersonaTAB: Predicting Personality Traits using Textual, Acoustic, and Behavioral Cues in Fully-Duplex Speech Dialogs
This work targets personality-aware conversational agents, but its contribution is incremental: it builds on existing methods for annotation and prediction.
The authors address the lack of personality annotations in speech datasets by building a pipeline that converts raw audio into an annotated dialogue dataset and by using large language models to predict personality traits from those annotations. Their system aligned more closely with human judgments than existing approaches.
Despite significant progress in neural spoken dialog systems, personality-aware conversation agents, capable of adapting behavior based on personalities, remain underexplored due to the absence of personality annotations in speech datasets. We propose a pipeline that preprocesses raw audio recordings to create a dialogue dataset annotated with timestamps, response types, and emotion/sentiment labels. We employ an automatic speech recognition (ASR) system to extract transcripts and timestamps, then generate conversation-level annotations. Leveraging these annotations, we design a system that employs large language models to predict conversational personality. Human evaluators were engaged to identify conversational characteristics and assign personality labels. Our analysis demonstrates that the proposed system achieves stronger alignment with human judgments than existing approaches.
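As a rough illustration of how response types might be derived from ASR timestamps in a fully-duplex setting, the sketch below labels each transcript segment from its timing relative to the current floor holder. It is not the authors' implementation: the `Segment` structure, the label names (`opening`, `continuation`, `backchannel`, `interruption`, `turn_taking`), the backchannel word list, and the two-word cutoff are all assumptions made for this example, since the abstract does not specify them.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One ASR segment (hypothetical structure for this sketch)."""
    speaker: str
    start: float  # seconds
    end: float    # seconds
    text: str


# Assumed word list; the source does not define what counts as a backchannel.
BACKCHANNEL_WORDS = {"yeah", "uh-huh", "mm-hmm", "right", "okay", "ok"}


def label_response_types(segments):
    """Return (segment, label) pairs using simple timing heuristics."""
    ordered = sorted(segments, key=lambda s: s.start)
    labeled = []
    floor = None  # most recent segment that was not a backchannel
    for seg in ordered:
        if floor is None:
            label = "opening"
        elif floor.speaker == seg.speaker:
            label = "continuation"
        else:
            words = [w.strip(".,!?") for w in seg.text.lower().split()]
            # Assumed cutoff: at most two words, all from the word list.
            short_ack = (
                bool(words)
                and len(words) <= 2
                and all(w in BACKCHANNEL_WORDS for w in words)
            )
            if short_ack:
                label = "backchannel"
            elif seg.start < floor.end:
                # Speech begins before the other speaker has finished.
                label = "interruption"
            else:
                label = "turn_taking"
        labeled.append((seg, label))
        # A backchannel does not take the floor from the current speaker.
        if label != "backchannel":
            floor = seg
    return labeled


example = [
    Segment("A", 0.0, 2.0, "So I went to the store yesterday"),
    Segment("B", 1.5, 1.8, "yeah"),
    Segment("A", 2.1, 4.0, "and they were out of milk"),
    Segment("B", 3.6, 5.0, "Wait, which store was it"),
    Segment("A", 5.4, 6.0, "The one downtown"),
]
example_labels = [label for _, label in label_response_types(example)]
```

Labels of this kind, together with emotion/sentiment tags, could then be serialized alongside the transcript as input to a language model. How the actual system formats that input is not stated in the abstract.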