CR LG, Feb 26, 2025

Truth in Text: A Meta-Analysis of ML-Based Cyber Information Influence Detection Approaches

arXiv:2503.22686v1 (h-index: 1)

Originality: Synthesis-oriented

AI Analysis

This paper addresses the challenge of disinformation detection for researchers and policymakers by synthesizing existing literature. The contribution is incremental, since it aggregates prior work without introducing new methods.

This meta-analysis evaluated the effectiveness of 81 machine learning techniques for detecting disinformation. It found a mean accuracy of 79.18%, with most techniques exceeding 80%, but also revealed inconsistencies across the literature and high variance within model subgroups.

Cyber information influence, or disinformation in general terms, is widely regarded as one of the biggest threats to social progress and government stability. From US presidential elections to European Union referendums and down to regional news reporting of wildfires, lies and post-truths have normalized radical decision-making. Accordingly, there has been an explosion in research seeking to detect disinformation in online media. The frontier of disinformation detection research leverages a variety of ML techniques, including traditional algorithms such as Support Vector Machines, Random Forest, and Naïve Bayes. Other research has applied deep learning models, including Convolutional Neural Networks, Long Short-Term Memory networks, and transformer-based architectures. Despite the overall success of such techniques, the literature shows inconsistencies when viewed holistically, which limits our understanding of their true effectiveness. Accordingly, this work employed a two-stage meta-analysis to (a) estimate an overall meta statistic for ML model effectiveness in detecting disinformation and (b) investigate the same by subgroups of ML model types. The study found that the majority of the 81 ML detection techniques sampled achieve greater than 80% accuracy, with a mean sample effectiveness of 79.18% accuracy. Meanwhile, the subgroups showed no statistically significant difference between approaches but revealed high within-group variance. Based on the results, this work recommends future work in replication and development of detection methods operating at the ML model level.
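To make the idea of an overall meta statistic and a subgroup breakdown concrete, here is a minimal sketch in Python. It is not the paper's procedure: the abstract does not name an estimator, so the DerSimonian-Laird random-effects model, the binomial variance approximation on the raw accuracy scale, the function names `pool_random_effects` and `pool_by_subgroup`, and every number in the toy records are assumptions chosen purely for illustration.

```python
import math


def pool_random_effects(accuracies, sample_sizes):
    """Pool reported accuracies with a DerSimonian-Laird random-effects model.

    Assumption: each accuracy is a proportion strictly between 0 and 1,
    measured on a test set of the given size (binomial variance).
    """
    # Within-study variance of each accuracy estimate.
    variances = [p * (1 - p) / n for p, n in zip(accuracies, sample_sizes)]
    weights = [1 / v for v in variances]
    total_w = sum(weights)

    # Fixed-effect estimate, needed to compute Cochran's Q.
    fixed = sum(w * p for w, p in zip(weights, accuracies)) / total_w
    q = sum(w * (p - fixed) ** 2 for w, p in zip(weights, accuracies))
    df = len(accuracies) - 1

    # Between-study variance (tau^2), truncated at zero.
    c = total_w - sum(w * w for w in weights) / total_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to every study's variance.
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * p for w, p in zip(re_weights, accuracies)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))

    # I^2: share of total variation attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return {"pooled": pooled, "se": se, "tau2": tau2, "q": q, "i2": i2}


def pool_by_subgroup(records):
    """Pool (label, accuracy, sample_size) records separately per label."""
    groups = {}
    for label, acc, n in records:
        groups.setdefault(label, []).append((acc, n))
    return {
        label: pool_random_effects([a for a, _ in rows], [n for _, n in rows])
        for label, rows in groups.items()
    }


# Hypothetical toy records, NOT taken from the paper.
toy_records = [
    ("traditional", 0.78, 1000),
    ("traditional", 0.85, 500),
    ("deep", 0.91, 2000),
    ("deep", 0.70, 800),
]

overall = pool_random_effects(
    [acc for _, acc, _ in toy_records], [n for _, _, n in toy_records]
)
print("overall pooled accuracy:", round(overall["pooled"], 4))
for label, result in pool_by_subgroup(toy_records).items():
    print(label, round(result["pooled"], 4), "I^2 =", round(result["i2"], 3))
```

In this sketch, a large `tau2` or `i2` inside a subgroup corresponds to the kind of high within-group variance the abstract reports. A fuller analysis would typically transform proportions (for example with a logit) before pooling and would formally test for between-subgroup differences.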
