CL · SI · Jul 2, 2019

Danish Stance Classification and Rumour Resolution

arXiv:1907.01304v1 · 1 citation
Originality: Synthesis-oriented
AI Analysis

This work addresses rumour detection for Danish social media users, but it is incremental in that it applies existing methods to a new language and dataset.

This thesis tackles rumour veracity prediction by building a Danish stance-annotated Reddit dataset and implementing stance classification models. A Linear Support Vector Machine performs best, with an accuracy of 0.76 and a macro F1 of 0.42. Using stance labels as the input to a Hidden Markov Model for veracity prediction yields accuracies up to 0.83 and F1 scores up to 0.68.
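The gap between an accuracy of 0.76 and a macro F1 of 0.42 may look surprising. One common cause of such a gap is class imbalance, although this summary does not say whether that applies here. The sketch below uses made-up toy labels (not data from the thesis) and an assumed four-way support/deny/query/comment label set to show how the two metrics can diverge when a classifier favours the majority class.

```python
def accuracy(gold, pred):
    """Fraction of positions where the prediction matches the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over every label seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    f1_scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        # A class that is never predicted correctly contributes an F1 of 0.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy, invented labels: mostly "C" (comment), one "S" (support), one "D" (deny).
gold = ["C"] * 8 + ["S", "D"]
# A classifier that always predicts the majority class.
pred = ["C"] * 10

print(f"accuracy = {accuracy(gold, pred):.2f}")
print(f"macro F1 = {macro_f1(gold, pred):.2f}")
```

Because macro F1 weights every class equally, the two classes that are never predicted correctly pull the average down even though most individual posts are labelled correctly.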

The Internet is rife with flourishing rumours that spread through microblogs and social media. Recent work has shown that analysing the stance of the crowd towards a rumour is a good indicator of its veracity. One state-of-the-art system uses an LSTM neural network to automatically classify stance for posts on Twitter by considering the context of a whole branch, while another, simpler Decision Tree classifier performs at least as well by relying on careful feature engineering. One approach to predicting the veracity of a rumour is to use stance as the only feature for a Hidden Markov Model (HMM). This thesis generates a stance-annotated Reddit dataset for the Danish language and implements various models for stance classification. Of these, a Linear Support Vector Machine provides the best results, with an accuracy of 0.76 and a macro F1 score of 0.42. Furthermore, experiments show that stance labels can be used across languages and platforms with an HMM to predict the veracity of rumours, achieving an accuracy of 0.82 and an F1 score of 0.67. Even higher scores are achieved by relying only on the Danish dataset: in this case, veracity prediction reaches an accuracy of 0.83 and an F1 of 0.68. Finally, when automatic stance labels are used for the HMM, only a small drop in performance is observed, showing that the implemented system can have practical applications.
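To make the HMM idea concrete, here is a minimal sketch of one way stance sequences could drive veracity prediction: score a conversation's stance sequence under one HMM per veracity class and pick the more likely class. This is an illustration under stated assumptions, not the thesis implementation. The support/deny/query/comment label set, the two-state models, and every probability below are hypothetical and hand-picked; a real system would learn the parameters from annotated data.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Log P(obs) under a discrete HMM, using the scaled forward algorithm."""
    n_states = len(start)
    # Initialise with the first observation.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the transition matrix, then emit obs[t].
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][obs[t]]
                for s in range(n_states)
            ]
        # Rescale to avoid underflow; the scales multiply up to P(obs).
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik


# Hypothetical parameters (NOT learned from data). Each emission dict maps a
# stance label (S=support, D=deny, Q=query, C=comment) to its probability.
TRUE_HMM = dict(
    start=[0.6, 0.4],
    trans=[[0.7, 0.3], [0.4, 0.6]],
    emit=[
        {"S": 0.5, "D": 0.05, "Q": 0.05, "C": 0.4},
        {"S": 0.2, "D": 0.1, "Q": 0.1, "C": 0.6},
    ],
)
FALSE_HMM = dict(
    start=[0.5, 0.5],
    trans=[[0.6, 0.4], [0.3, 0.7]],
    emit=[
        {"S": 0.1, "D": 0.5, "Q": 0.2, "C": 0.2},
        {"S": 0.1, "D": 0.2, "Q": 0.2, "C": 0.5},
    ],
)


def predict_veracity(stance_sequence):
    """Return the veracity class whose HMM assigns the sequence higher likelihood."""
    scores = {
        "true": forward_log_likelihood(stance_sequence, **TRUE_HMM),
        "false": forward_log_likelihood(stance_sequence, **FALSE_HMM),
    }
    return max(scores, key=scores.get)


print(predict_veracity(["S", "S", "C", "S"]))
print(predict_veracity(["D", "Q", "D", "D"]))
```

Because the only input is a sequence of stance labels, the same scoring step works whether the labels come from human annotators or from an automatic stance classifier, which is the setting the final sentence of the abstract refers to.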
