CL, CY | Sep 11, 2023

Personality Detection and Analysis using Twitter Data

arXiv:2309.05497v1 | 7 citations | h-index: 32
Originality: Synthesis-oriented
AI Analysis

This work addresses a data scarcity issue for researchers in computational linguistics and psychology, enabling more robust personality detection applications. It is incremental, however, because it focuses on dataset creation rather than novel methods.

The authors tackle the problem of limited data for personality detection from text by collecting and releasing the largest automatically curated dataset for Myers-Briggs personality type prediction, containing 152 million tweets and 56 thousand data points. They then perform qualitative and quantitative analyses to show how the results align with natural intuition.

Personality types are important in various fields because they capture relevant information about a person's characteristics in an explainable format. They are often good predictors of a person's behavior in a particular environment and have applications ranging from candidate selection to marketing and mental health. Recently, automatic detection of personality traits from text has gained significant attention in computational linguistics. Most personality detection and analysis methods have focused on small datasets, which often limits their experimental observations. To bridge this gap, we collect and release the largest automatically curated dataset for the research community, with 152 million tweets and 56 thousand data points for the Myers-Briggs personality type (MBTI) prediction task. We perform a series of extensive qualitative and quantitative studies on our dataset to better analyze the data patterns and draw conclusions. We show how our analysis results often follow natural intuition. We also perform a series of ablation studies to show how the baselines perform on our dataset.
