CR, LG · Jan 6, 2021

Phishing Attacks and Websites Classification Using Machine Learning and Multiple Datasets (A Comparative Analysis)

Sohail Ahmed Khan, Wasiq Khan, Abir Hussain

arXiv:2101.02552v1 · 27 citations

Originality: Incremental advance

AI Analysis

This research addresses the problem of accurately identifying phishing attacks for individuals and organizations by comparing machine learning algorithms.

This study compares machine learning algorithms for classifying phishing websites across multiple datasets. It found that Random Forest and Artificial Neural Networks achieved over 97% accuracy using the most significant features identified.

Phishing attacks are the most common type of cyber-attack used to obtain sensitive information, and they affect individuals as well as organisations across the globe. Various techniques have been proposed to identify phishing attacks, in particular the deployment of machine intelligence in recent years. However, the algorithms deployed and the discriminating factors used are very diverse across existing works. In this study, we present a comprehensive analysis of various machine learning algorithms to evaluate their performance over multiple datasets. We further investigate the most significant features within multiple datasets and compare the classification performance on the reduced-dimensional datasets. The statistical results indicate that random forest and artificial neural network outperform other classification algorithms, achieving over 97% accuracy using the identified features.
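
The comparison described in the abstract can be illustrated with a minimal sketch, which is not the authors' pipeline. It assumes scikit-learn is available, uses a synthetic dataset from `make_classification` as a stand-in for a phishing dataset, and ranks features by random forest importance, which is one common choice and not necessarily the selection method used in the paper. The value of `top_k`, the network size, and all other hyperparameters are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a phishing dataset (assumption: 30 features, binary label).
X, y = make_classification(
    n_samples=1000, n_features=30, n_informative=8, n_redundant=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Rank features by random forest importance, using the training split only.
ranker = RandomForestClassifier(n_estimators=100, random_state=0)
ranker.fit(X_train, y_train)
top_k = 10  # hypothetical size of the reduced feature set
top_idx = ranker.feature_importances_.argsort()[::-1][:top_k]


def make_models():
    """Return fresh, unfitted instances of the two classifiers being compared."""
    return {
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
        # The network is trained on standardized inputs.
        "neural_network": make_pipeline(
            StandardScaler(),
            MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0),
        ),
    }


# Fit each classifier on the full and on the reduced feature set,
# then record its accuracy on the held-out test split.
feature_sets = {"all_features": slice(None), "top_features": top_idx}
results = {}
for set_name, cols in feature_sets.items():
    for model_name, model in make_models().items():
        model.fit(X_train[:, cols], y_train)
        results[(model_name, set_name)] = model.score(X_test[:, cols], y_test)

for key, accuracy in sorted(results.items()):
    print(key, round(accuracy, 3))
```

Accuracies from this sketch say nothing about the paper's reported figures, since the data are synthetic. The point is only the structure of the experiment: the same classifiers are scored on the full feature set and on a reduced one, so the effect of feature selection can be read off directly.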
