CR, LG | Oct 12, 2021

Federated Phish Bowl: LSTM-Based Decentralized Phishing Email Detection

arXiv:2110.06025v2
Originality: Synthesis-oriented
AI Analysis

This work addresses phishing detection for email users with a privacy-preserving solution, though it is incremental: it applies existing federated learning and LSTM techniques to this domain.

The paper tackles phishing email detection by proposing Federated Phish Bowl (FedPB), a decentralized framework that combines LSTM and federated learning to enable collaborative detection while preserving privacy. It achieves a prediction accuracy of 83%, which is competitive with centralized methods.

As phishing campaigns have grown more sophisticated in recent years, phishing emails lure people using more legitimate-looking personal contexts. To tackle this problem, more adaptive detection systems, such as natural language processing (NLP)-powered approaches, are essential to understanding phishing text representations, in place of traditional heuristics-based algorithms. Nevertheless, concerns about collecting phishing data that might contain confidential information hinder the effectiveness of model learning. We propose a decentralized phishing email detection framework called Federated Phish Bowl (FedPB), which facilitates collaborative phishing detection with privacy. In particular, we devise a knowledge-sharing mechanism with federated learning (FL). Using long short-term memory (LSTM) for phishing detection, the framework adapts by sharing a global word embedding matrix across the clients, with each client running its local model on non-IID data. We collected the most recent phishing samples to study the effectiveness of the proposed method using different client numbers and data distributions. The results show that FedPB attains performance competitive with a centralized phishing detector and generalizes across various FL settings, retaining a prediction accuracy of 83%.
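The knowledge-sharing step described in the abstract, a global word embedding matrix shared across clients, can be illustrated with a small sketch. The abstract does not say how the matrix is aggregated, so the sample-count-weighted average below (in the style of FedAvg) is an assumption, not the authors' stated method. The function name `aggregate_embeddings`, the toy matrices, and the client sizes are all hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def aggregate_embeddings(client_embeddings: List[Matrix],
                         client_sizes: List[int]) -> Matrix:
    """Combine per-client embedding matrices into one global matrix.

    Assumption: a FedAvg-style average weighted by each client's local
    sample count. The source only states that a global word embedding
    matrix is shared; the aggregation rule here is illustrative.
    """
    if not client_embeddings or len(client_embeddings) != len(client_sizes):
        raise ValueError("need one sample count per client embedding matrix")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    rows = len(client_embeddings[0])      # vocabulary size
    cols = len(client_embeddings[0][0])   # embedding dimension
    global_emb = [[0.0] * cols for _ in range(rows)]

    for emb, n in zip(client_embeddings, client_sizes):
        weight = n / total  # clients with more data contribute more
        for i in range(rows):
            for j in range(cols):
                global_emb[i][j] += weight * emb[i][j]
    return global_emb


# Toy round: two clients, a 2-word vocabulary, 2-dimensional embeddings.
client_a = [[1.0, 0.0], [0.0, 1.0]]   # hypothetical client with 100 emails
client_b = [[3.0, 2.0], [2.0, 3.0]]   # hypothetical client with 300 emails
global_embedding = aggregate_embeddings([client_a, client_b], [100, 300])
print(global_embedding)
```

In this sketch only the embedding matrices leave the clients; the raw emails, which may contain confidential information, stay local. Each client would then load the returned matrix into its own local LSTM model before the next training round.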

