CL · Nov 1, 2018

GlobalTrait: Personality Alignment of Multilingual Word Embeddings

Farhad Bin Siddique, Dario Bertero, Pascale Fung

arXiv:1811.00240v2 · 0.31 citations

Originality: Incremental advance

AI Analysis

This work addresses personality analysis in low-resource languages by enabling transfer from high-resource ones, though the advance is incremental because it builds on existing multilingual embedding techniques.

The paper tackles multilingual personality trait recognition by proposing GlobalTrait, a personality alignment method for word embeddings, which raises the average F-score from 65 to 73.4 across the three non-English languages (Spanish, Dutch, and Italian).

We propose a multilingual model to recognize Big Five personality traits from text data in four languages: English, Spanish, Dutch, and Italian. Our analysis shows that words with similar semantic meaning in different languages do not necessarily correspond to the same personality traits. We therefore propose a personality alignment method, GlobalTrait, which has a mapping for each trait from the source language to the target language (English), such that words that correlate positively with each trait are close together in the multilingual vector space. Using these aligned embeddings for training, we can transfer personality-related training features from high-resource languages such as English to low-resource languages and obtain better multilingual results than with simple monolingual or unaligned multilingual embeddings. We achieve an average F-score increase (across the three languages other than English) from 65 to 73.4 (+8.4) when comparing our monolingual model to the multilingual CNN model with personality-aligned embeddings. We also show relatively good performance in the regression tasks, and better classification results when evaluating our model on a separate Chinese dataset.
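
The per-trait mapping described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the mapping is parameterized or trained, so this example assumes an orthogonal linear map solved in closed form (orthogonal Procrustes). The function name `fit_trait_mapping`, the toy dimensions, and the random vectors standing in for trait-positive words are all hypothetical.

```python
import numpy as np

def fit_trait_mapping(src_vecs, tgt_vecs):
    """Return the orthogonal matrix W minimizing ||src_vecs @ W - tgt_vecs||_F.

    Assumption: row i of src_vecs and row i of tgt_vecs are embeddings of
    words paired for the same trait (source language vs. English).
    """
    u, _, vt = np.linalg.svd(src_vecs.T @ tgt_vecs)
    return u @ vt

rng = np.random.default_rng(0)
dim, n_words = 8, 50
traits = ["openness", "conscientiousness", "extraversion",
          "agreeableness", "neuroticism"]

mappings, before, after = {}, {}, {}
for trait in traits:
    # Toy data (hypothetical): English vectors for trait-positive words, and
    # source-language vectors that are a hidden rotation of them plus noise.
    tgt = rng.normal(size=(n_words, dim))
    hidden_rot, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
    src = tgt @ hidden_rot.T + 0.01 * rng.normal(size=(n_words, dim))

    W = fit_trait_mapping(src, tgt)  # one mapping per trait
    mappings[trait] = W
    before[trait] = np.linalg.norm(src - tgt)      # distance before alignment
    after[trait] = np.linalg.norm(src @ W - tgt)   # distance after alignment

for trait in traits:
    print(f"{trait}: {before[trait]:.2f} -> {after[trait]:.2f}")
```

In this sketch, each trait gets its own matrix, and source-language vectors multiplied by that matrix land near their English counterparts; the aligned vectors would then serve as input features to a classifier such as the CNN mentioned in the abstract.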
