LGCLIRMar 22, 2021

Detection of fake news on CoViD-19 on Web Search Engines

arXiv:2103.11804v241 citations
AI Analysis

This addresses the infodemic of misleading information during the COVID-19 pandemic for users relying on search engines, but it is incremental as it applies existing methods to a new domain.

The study tackled the problem of detecting fake news related to COVID-19 on web search engines by analyzing textual and URL features, showing that combining these features can improve classifier performance, though no concrete numbers were provided.

In early January 2020, after China reported the first cases of the new coronavirus (SARS-CoV-2) in the city of Wuhan, unreliable and not fully accurate information has started spreading faster than the virus itself. Alongside this pandemic, people have experienced a parallel infodemic, i.e., an overabundance of information, some of which misleading or even harmful, that has widely spread around the globe. Although Social Media are increasingly being used as information source, Web Search Engines, like Google or Yahoo!, still represent a powerful and trustworthy resource for finding information on the Web. This is due to their capability to capture the largest amount of information, helping users quickly identify the most relevant, useful, although not always the most reliable, results for their search queries. This study aims to detect potential misleading and fake contents by capturing and analysing textual information, which flow through Search Engines. By using a real-world dataset associated with recent CoViD-19 pandemic, we first apply re-sampling techniques for class imbalance, then we use existing Machine Learning algorithms for classification of not reliable news. By extracting lexical and host-based features of associated Uniform Resource Locators (URLs) for news articles, we show that the proposed methods, so common in phishing and malicious URLs detection, can improve the efficiency and performance of classifiers. Based on these findings, we suggest that the use of both textual and URLs features can improve the effectiveness of fake news detection methods.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes