LG SI | Sep 7, 2024

Sequential Classification of Misinformation

arXiv:2409.04860v1 | 2.6 | h-index: 13 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses the need for more nuanced misinformation detection for social media platforms, offering an incremental improvement over binary classification methods.

The paper tackles the problem of online multiclass classification of misinformation in social media information flow, proposing two sequential detection algorithms that outperform state-of-the-art methods by reducing both classification error and detection time on real-world datasets.

In recent years there has been growing interest in online auditing of information flow over social networks, with the goal of monitoring undesirable effects such as misinformation and fake news. Most previous work on the subject focuses on the binary problem of classifying information as fake or genuine. Nonetheless, in many practical scenarios the multi-class/label setting is of particular importance. For example, a social media platform may want to distinguish between "true", "partly-true", and "false" information. Accordingly, in this paper, we consider the problem of online multiclass classification of information flow. To that end, driven by empirical studies on information flow over real-world social media networks, we propose a probabilistic information flow model over graphs. The learning task is then to detect the label of the information flow, with the goal of minimizing a combination of the classification error and the detection time. For this problem, we propose two detection algorithms: the first is based on the well-known multiple sequential probability ratio test, while the second is a novel graph-neural-network-based sequential decision algorithm. For both algorithms, we prove several strong statistical guarantees. We also construct a data-driven algorithm for learning the proposed probabilistic model. Finally, we test our algorithms on two real-world datasets and show that they outperform other state-of-the-art misinformation detection algorithms in terms of detection time and classification error.
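To make the idea of trading classification error against detection time concrete, here is a minimal sketch of a multiple sequential probability ratio test (MSPRT) in Python. It is not the paper's algorithm: it assumes independent observations with known per-class likelihoods as a stand-in for the paper's graph-based information flow model, and the function name `msprt`, the `threshold` parameter, and the Bernoulli "share" rates in the usage example are all hypothetical.

```python
import math
from typing import Callable, Iterable, List, Optional, Tuple


def msprt(
    observations: Iterable[int],
    log_likelihoods: List[Callable[[int], float]],
    threshold: float = 0.95,
    priors: Optional[List[float]] = None,
) -> Tuple[int, int, bool]:
    """Posterior-threshold MSPRT over K candidate classes.

    Assumes i.i.d. observations (an illustrative simplification).
    Returns (predicted_class, observations_used, stopped_early).
    """
    k = len(log_likelihoods)
    priors = priors or [1.0 / k] * k
    # Running log-posterior (up to a constant) for each class.
    log_post = [math.log(p) for p in priors]
    steps = 0
    for steps, x in enumerate(observations, start=1):
        for i, ll in enumerate(log_likelihoods):
            log_post[i] += ll(x)
        # Normalize with log-sum-exp for numerical stability.
        m = max(log_post)
        log_norm = m + math.log(sum(math.exp(v - m) for v in log_post))
        posteriors = [math.exp(v - log_norm) for v in log_post]
        best = max(range(k), key=posteriors.__getitem__)
        # Stop as soon as one class is sufficiently likely.
        if posteriors[best] >= threshold:
            return best, steps, True
    # Observations ran out: fall back to the most likely class so far.
    best = max(range(k), key=log_post.__getitem__)
    return best, steps, False


def bernoulli_ll(p: float) -> Callable[[int], float]:
    """Log-likelihood of a 0/1 observation under success rate p."""
    return lambda x: math.log(p if x else 1.0 - p)


# Hypothetical classes 0/1/2 with assumed share rates 0.2, 0.5, 0.8.
classes = [bernoulli_ll(0.2), bernoulli_ll(0.5), bernoulli_ll(0.8)]
label, n_used, stopped = msprt([1] * 10, classes, threshold=0.9)
print(label, n_used, stopped)
```

Raising `threshold` lowers the chance of a wrong label but requires more observations before stopping, which is the error-versus-delay trade-off the abstract describes.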

