CL Nov 19, 2024

Strengthening False Information Propagation Detection: Leveraging SVM and Sophisticated Text Vectorization Techniques in comparison to BERT

arXiv:2411.12703v3
h-index: 3
2025 International Conference on Quantum Photonics, Artificial Intelligence, and Networking (QPAIN)
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for reliable misinformation detection on online platforms, but its contribution is incremental: it compares existing methods without introducing new ones.

This study compares SVM classifiers built on several text vectorization methods against BERT for fake news detection. BERT achieved 99.98% accuracy and an F1-score of 0.9998, while SVM with BoW vectorization reached 99.81% accuracy and an F1-score of 0.9980.

The rapid spread of misinformation, particularly through online platforms, underscores the urgent need for reliable detection systems. This study explores the use of machine learning and natural language processing, specifically Support Vector Machines (SVM) and BERT, to detect fake news. We employ three distinct text vectorization methods for SVM: Term Frequency-Inverse Document Frequency (TF-IDF), Word2Vec, and Bag of Words (BoW), evaluating their effectiveness in distinguishing between genuine and fake news. Additionally, we compare these methods against the transformer-based language model BERT. Our approach includes detailed preprocessing steps, model implementation, and evaluation to determine the most effective techniques. The results demonstrate that while BERT achieves the highest accuracy at 99.98% with an F1-score of 0.9998, the SVM model with a linear kernel and BoW vectorization also performs exceptionally well, achieving 99.81% accuracy and an F1-score of 0.9980. These findings highlight that, despite BERT's superior performance, SVM models with BoW and TF-IDF vectorization come remarkably close, offering highly competitive performance with the advantage of lower computational requirements.
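To make the difference between two of the vectorization methods named above concrete, here is a minimal sketch of BoW and TF-IDF using only the Python standard library. It is not the authors' pipeline: the toy corpus, the `tokenize` helper, and the smoothed IDF formula are assumptions chosen for illustration, and the paper's actual preprocessing and weighting may differ.

```python
import math
from collections import Counter

# Toy corpus invented for illustration only (not from the paper's dataset).
docs = [
    "breaking shocking cure discovered",
    "senate passes budget bill",
    "shocking secret cure exposed",
]


def tokenize(text):
    # Hypothetical minimal preprocessing: lowercase and split on whitespace.
    return text.lower().split()


tokenized = [tokenize(d) for d in docs]

# Shared vocabulary, sorted so every vector uses the same column order.
vocab = sorted({tok for doc in tokenized for tok in doc})


def bow_vector(tokens):
    # Bag of Words: raw count of each vocabulary term in the document.
    counts = Counter(tokens)
    return [counts[term] for term in vocab]


def idf(term):
    # Smoothed inverse document frequency; one common variant, assumed here.
    df = sum(1 for doc in tokenized if term in doc)
    return math.log((1 + len(tokenized)) / (1 + df)) + 1.0


def tfidf_vector(tokens):
    # TF-IDF: term count scaled down for terms that occur in many documents.
    counts = Counter(tokens)
    return [counts[term] * idf(term) for term in vocab]


bow = [bow_vector(doc) for doc in tokenized]
tfidf = [tfidf_vector(doc) for doc in tokenized]
```

Each document becomes a fixed-length numeric vector over the shared vocabulary. In a setup like the one described, such vectors would be the input features for a linear-kernel SVM; the classifier itself is omitted here because the abstract does not specify the implementation used.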

