CR · Aug 7, 2016

An intelligent classification model for phishing email detection

arXiv:1608.02196v1 · 80 citations
Originality: Incremental advance
AI Analysis

This paper addresses a cybersecurity threat to email users, though it appears incremental: it builds on existing classification methods with enhancements.

The paper tackles phishing email detection by developing an intelligent classification model that uses text processing and data mining techniques, achieving 0.991 accuracy with Random Forest and 0.984 with J48 on an accredited dataset.

Phishing attacks are among the trending cyber attacks. They use socially engineered messages, sent by professional hackers, to fool users into revealing sensitive information; the most popular channel for these messages is users' email. This paper presents an intelligent classification model for detecting phishing emails using knowledge discovery, data mining, and text processing techniques. It introduces the concept of phishing term weighting, which evaluates the weight of phishing terms in each email. The preprocessing phase is enhanced by applying text stemming and the WordNet ontology to enrich the model with word synonyms. The model applies the knowledge discovery procedure using five popular classification algorithms and achieves a notable improvement in classification accuracy: 0.991 with the Random Forest algorithm and 0.984 with J48, which is, to our knowledge, the highest accuracy reported for an accredited data set. The paper also presents a comparative study with similar proposed classification techniques.
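
The abstract does not give the weighting formula, so the following is only a minimal sketch of what a per-email phishing term weight could look like. Everything in it is an assumption made for illustration: the term list `PHISHING_TERMS`, the small `SYNONYMS` map (a stand-in for WordNet lookups), the toy suffix stripper `crude_stem` (a stand-in for a real stemmer), and the choice of weight as the fraction of normalized tokens that are phishing terms.

```python
import re

# Hypothetical phishing vocabulary, already reduced to stems (not from the paper).
PHISHING_TERMS = {"verify", "account", "password", "suspend", "click", "bank"}

# Hypothetical synonym map standing in for WordNet synonym lookups.
SYNONYMS = {"confirm": "verify", "validate": "verify", "login": "account"}

# Suffixes removed by the toy stemmer, tried in this order.
SUFFIXES = ("ing", "ed", "s")


def crude_stem(token):
    """Strip at most one common suffix; a toy stand-in for a real stemmer."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def normalize(text):
    """Lowercase, tokenize, stem, then map synonyms onto a canonical term."""
    tokens = re.findall(r"[a-z]+", text.lower())
    stems = [crude_stem(t) for t in tokens]
    return [SYNONYMS.get(s, s) for s in stems]


def phishing_term_weight(text):
    """Assumed weight: fraction of normalized tokens that are phishing terms."""
    terms = normalize(text)
    if not terms:
        return 0.0
    hits = sum(1 for t in terms if t in PHISHING_TERMS)
    return hits / len(terms)


if __name__ == "__main__":
    print(phishing_term_weight("Please confirm your password"))
    print(phishing_term_weight("Lunch at noon tomorrow"))
```

In a pipeline like the one the abstract describes, a value of this kind would be one input feature among others passed to a classifier such as Random Forest or J48; the classifier step is omitted here.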

