CL AI Apr 23, 2024

Evaluating the Efficacy of Large Language Models in Identifying Phishing Attempts

arXiv:2404.15485v3, 9 citations, h-index: 32, HSI
Originality: Synthesis-oriented
AI Analysis

The paper addresses phishing detection for cybersecurity applications, but its contribution is incremental: it applies existing LLMs to a known task.

This paper analyzed the effectiveness of 15 Large Language Models (LLMs) in detecting phishing attempts, specifically '419 Scam' emails, and found that ChatGPT 3.5, GPT-3.5-Turbo-Instruct, and ChatGPT were the most effective models.

Phishing, a prevalent cybercrime tactic for decades, remains a significant threat in today's digital world. By combining social engineering with modern technology, cybercriminals target individuals, businesses, and organizations, exploiting their trust and undermining their security. Attackers often disguise themselves as legitimate, trustworthy sources. By exploiting psychological levers such as urgency, fear, and social proof, among other manipulative strategies, phishers lure individuals into revealing sensitive personal information. To address this pervasive problem, this paper analyzes the effectiveness of 15 Large Language Models (LLMs) in detecting phishing attempts, focusing on a randomized set of "419 Scam" emails. The objective is to determine which LLMs can accurately detect phishing emails by analyzing a text file containing email metadata based on predefined criteria. The experiment concluded that ChatGPT 3.5, GPT-3.5-Turbo-Instruct, and ChatGPT were the most effective models at detecting phishing emails.
