CL, Oct 21, 2019

Localization of Fake News Detection via Multitask Transfer Learning

Jan Christian Blaise Cruz, Julianne Agatha Tan, Charibeth Cheng

arXiv:1910.09295v3, Has Code

Originality: Incremental advance

AI Analysis

This work addresses fake news detection for low-resource language communities, offering incremental improvements through dataset creation and method adaptation.

The authors address fake news detection in low-resource languages by building a benchmark dataset for Filipino and benchmarking transfer learning techniques. Transfer learning alone reaches 91% accuracy, a 14% error reduction compared to few-shot baselines; augmenting it with auxiliary language modeling losses raises accuracy to 96% on the best model.

The use of the internet as a fast medium of spreading fake news reinforces the need for computational tools that combat it. Techniques that train fake news classifiers exist, but they all assume an abundance of resources including large labeled datasets and expert-curated corpora, which low-resource languages may not have. In this work, we make two main contributions: First, we alleviate resource scarcity by constructing the first expertly-curated benchmark dataset for fake news detection in Filipino, which we call "Fake News Filipino." Second, we benchmark Transfer Learning (TL) techniques and show that they can be used to train robust fake news classifiers from little data, achieving 91% accuracy on our fake news dataset, reducing the error by 14% compared to established few-shot baselines. Furthermore, lifting ideas from multitask learning, we show that augmenting transformer-based transfer techniques with auxiliary language modeling losses improves their performance by adapting to writing style. Using this, we improve TL performance by 4-6%, achieving an accuracy of 96% on our best model. Lastly, we show that our method generalizes well to different types of news articles, including political news, entertainment news, and opinion articles.
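The multitask augmentation described in the abstract adds an auxiliary language modeling loss to the classification objective. The abstract does not give the exact formulation, so the following is only a minimal sketch of how two such losses might be combined. The function names, the `lm_weight` parameter, and its default value are hypothetical, and the toy logits stand in for the outputs of a transformer that would in practice be trained by gradient descent.

```python
import math


def cross_entropy(logits, target):
    """Negative log-likelihood of index `target` under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def multitask_loss(cls_logits, label, lm_logits, next_tokens, lm_weight=0.5):
    """Combine a classification loss with an auxiliary language modeling loss.

    cls_logits:  scores over the classes (e.g. real vs. fake) for one article
    label:       index of the true class
    lm_logits:   one list of vocabulary scores per token position
    next_tokens: index of the true next token at each position
    lm_weight:   hypothetical mixing weight for the auxiliary loss
    """
    if not next_tokens or len(lm_logits) != len(next_tokens):
        raise ValueError("lm_logits and next_tokens must be non-empty and aligned")
    cls_loss = cross_entropy(cls_logits, label)
    # Average the per-position language modeling losses.
    lm_loss = sum(
        cross_entropy(scores, tok) for scores, tok in zip(lm_logits, next_tokens)
    ) / len(next_tokens)
    return cls_loss + lm_weight * lm_loss, cls_loss, lm_loss


# Toy example: two classes, a three-word vocabulary, two token positions.
total, cls_loss, lm_loss = multitask_loss(
    cls_logits=[2.0, 0.0],
    label=0,
    lm_logits=[[1.0, 0.0, 0.0], [0.0, 0.0, 3.0]],
    next_tokens=[0, 2],
)
print(f"classification={cls_loss:.3f} lm={lm_loss:.3f} total={total:.3f}")
```

Setting `lm_weight` to 0 recovers plain classification fine-tuning, which makes the auxiliary term easy to ablate in a sketch like this one.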
