LG, CL, CR | Feb 13, 2024

Prompted Contextual Vectors for Spear-Phishing Detection

arXiv:2402.08309v3 | 23 citations | h-index: 46
AI Analysis

This work addresses a critical challenge for cybersecurity professionals by offering a new method to counter escalating spear-phishing threats, though it is incremental in that it applies LLMs to a specific domain.

The paper tackles the problem of detecting LLM-generated spear-phishing emails by proposing a novel document vectorization method that uses an ensemble of LLMs to quantify persuasion principles, achieving a 91% F1 score on a unique dataset.

Spear-phishing attacks present a significant security challenge, with large language models (LLMs) escalating the threat by generating convincing emails and facilitating target reconnaissance. To address this, we propose a detection approach based on a novel document vectorization method that utilizes an ensemble of LLMs to create representation vectors. By prompting LLMs to reason and respond to human-crafted questions, we quantify the presence of common persuasion principles in the email's content, producing prompted contextual document vectors for a downstream supervised machine learning model. We evaluate our method using a unique dataset generated by a proprietary system that automates target reconnaissance and spear-phishing email creation. Our method achieves a 91% F1 score in identifying LLM-generated spear-phishing emails, with the training set comprising only traditional phishing and benign emails. Key contributions include a novel document vectorization method utilizing LLM reasoning, a publicly available dataset of high-quality spear-phishing emails, and the demonstrated effectiveness of our method in detecting such emails. This methodology can be utilized for various document classification tasks, particularly in adversarial problem domains.
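
The vectorization idea in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption for illustration only: the three questions are hypothetical and are not the paper's prompts, `keyword_scorer` is a keyword-matching stub standing in for a real LLM call, and the nearest-centroid classifier is a stand-in for whatever downstream supervised model the authors use. The only point is the shape of the pipeline: each (model, question) pair contributes one number, and the concatenated numbers form the document vector.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical persuasion-principle questions (NOT the paper's actual prompts).
QUESTIONS: List[str] = [
    "Does the email create a sense of urgency?",
    "Does the email appeal to authority?",
    "Does the email ask the reader to click a link or open a file?",
]

# A scorer maps (question, email text) to a score in [0, 1].
Scorer = Callable[[str, str], float]


def keyword_scorer(cues_by_question: Dict[str, List[str]]) -> Scorer:
    """Stub standing in for one LLM in the ensemble (assumption: a real
    system would prompt a model and parse its answer into a number)."""
    def score(question: str, email: str) -> float:
        cues = cues_by_question.get(question, [])
        if not cues:
            return 0.0
        text = email.lower()
        hits = sum(1 for cue in cues if cue in text)
        return hits / len(cues)
    return score


def vectorize(email: str, questions: Sequence[str],
              scorers: Sequence[Scorer]) -> List[float]:
    """One entry per (scorer, question) pair, concatenated model by model."""
    return [scorer(q, email) for scorer in scorers for q in questions]


def centroid(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def predict(vector: Sequence[float],
            centroids: Dict[str, Sequence[float]]) -> str:
    """Return the label whose centroid is nearest (squared Euclidean)."""
    def dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: dist(vector, centroids[label]))
```

With two stub scorers and three questions, every email maps to a six-dimensional vector. Centroids fitted on ordinary phishing and benign examples can then be used to label an unseen email, which mirrors (in toy form) the abstract's setup of training without spear-phishing examples.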

Code Implementations: 1 repo