CL CY LG

Oct 3, 2022

SemEval 2023 Task 9: Multilingual Tweet Intimacy Analysis

Jiaxin Pei, Vítor Silva, Maarten Bos, Yozon Liu, Leonardo Neves, David Jurgens, Francesco Barbieri

Stanford

arXiv:2210.01108v23.429 citations

h-index: 42

Originality: Synthesis-oriented

AI Analysis

This work contributes a new dataset and benchmark for multilingual intimacy analysis. The contribution is incremental: it extends existing intimacy research to more languages and social media contexts.

The authors address intimacy analysis in multilingual tweets by creating MINT, a dataset of 13,372 tweets across 10 languages, and benchmarking popular multilingual pre-trained language models on it.

We propose MINT, a new Multilingual INTimacy analysis dataset covering 13,372 tweets in 10 languages including English, French, Spanish, Italian, Portuguese, Korean, Dutch, Chinese, Hindi, and Arabic. We benchmarked a list of popular multilingual pre-trained language models. The dataset is released along with the SemEval 2023 Task 9: Multilingual Tweet Intimacy Analysis (https://sites.google.com/umich.edu/semeval-2023-tweet-intimacy).
