CL, Jul 13, 2018

Multi-task dialog act and sentiment recognition on Mastodon

arXiv:1807.05013v11096 citations
Has Code
Originality: Synthesis-oriented
AI Analysis

This work addresses the critical issue of reproducible research in social media by providing an open-source alternative to Twitter data, though it is incremental in its application of existing methods to a new dataset.

The authors tackled the problem of data reproducibility in social media research by creating a new annotated corpus from Mastodon, a Twitter-like platform with open licenses, and trained a multi-task hierarchical recurrent network for joint sentiment and dialog act recognition, achieving efficient transfer learning between tasks.

Because of license restrictions, it often becomes impossible to strictly reproduce most research results on Twitter data already a few months after the creation of the corpus. This situation worsens gradually as time passes and tweets become inaccessible. This is a critical issue for reproducible and accountable research on social media. We partly solve this challenge by annotating a new Twitter-like corpus from an alternative large social medium with licenses that are compatible with reproducible experiments: Mastodon. We manually annotate both dialogues and sentiments on this corpus, and train a multi-task hierarchical recurrent network on joint sentiment and dialog act recognition. We experimentally demonstrate that transfer learning may be efficiently achieved between both tasks, and further analyze some specific correlations between sentiments and dialogues on social media. Both the annotated corpus and deep network are released with an open-source license.

Code Implementations: 1 repo