Personal-ITY: A Novel YouTube-based Corpus for Personality Prediction in Italian
This work addresses the need for better personality prediction resources for Italian. Its contribution is incremental: it applies existing methods to a new dataset.
The authors address personality prediction in Italian by creating Personal-ITY, a new YouTube-based corpus that covers more authors and a different genre than existing resources. Their preliminary experiments show that some personality types are easier to predict than others.
We present a novel corpus for personality prediction in Italian, containing a larger number of authors and a different genre compared to previously available resources. The corpus is built using Distant Supervision, assigning Myers-Briggs Type Indicator (MBTI) labels to YouTube comments, and lends itself to a variety of experiments. We report on preliminary experiments on Personal-ITY, which can serve as a baseline for future work, showing that some types are easier to predict than others, and discussing the benefits of cross-dataset prediction.
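To make the idea of distant supervision with MBTI labels concrete, the following is a minimal sketch, not the authors' actual pipeline. It assumes that labels come from comments in which an author mentions a four-letter MBTI code, and that an author is kept only when exactly one distinct code is found; the abstract does not specify either rule. The function names `find_self_reported_type` and `label_authors` and the sample comments are hypothetical.

```python
import re
from typing import Dict, Iterable, Optional, Set, Tuple

# A four-letter MBTI code as a standalone token (E/I, N/S, F/T, J/P),
# matched case-insensitively. Assumption: a bare mention counts as a
# self-report; a real pipeline would likely need stricter patterns.
MBTI_PATTERN = re.compile(r"\b([EI][NS][FT][JP])\b", re.IGNORECASE)


def find_self_reported_type(comment: str) -> Optional[str]:
    """Return the MBTI code if the comment mentions exactly one distinct code."""
    found = {code.upper() for code in MBTI_PATTERN.findall(comment)}
    return found.pop() if len(found) == 1 else None


def label_authors(comments: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Map each author to an MBTI label, skipping ambiguous authors.

    `comments` is an iterable of (author, text) pairs (hypothetical format).
    """
    candidates: Dict[str, Set[str]] = {}
    for author, text in comments:
        mbti = find_self_reported_type(text)
        if mbti is not None:
            candidates.setdefault(author, set()).add(mbti)
    # Keep only authors whose comments agree on a single type.
    return {a: next(iter(t)) for a, t in candidates.items() if len(t) == 1}


sample = [
    ("a", "Io sono INFJ"),
    ("b", "Bel video!"),
    ("c", "entp qui"),
    ("c", "forse sono INTP"),
    ("a", "ancora INFJ"),
]
labels = label_authors(sample)
```

In this sketch the label assigned to an author would then be propagated to all of that author's comments, which is what makes the supervision "distant": no comment is annotated by hand.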