CR · LG · ML · Oct 31, 2018

A Mixture Model Based Defense for Data Poisoning Attacks Against Naive Bayes Spam Filters

arXiv:1811.00121v1 · 1 citation
AI Analysis

This paper addresses a security vulnerability in spam filtering systems and offers a practical defense against poisoning attacks. The contribution is incremental, since it builds on existing mixture model techniques.

The paper addresses the susceptibility of naive Bayes spam filters to data poisoning attacks by proposing a mixture model defense. The learned mixture isolates the attack in a separate component and leaves the original spam model largely unaffected; experiments on the TREC 2005 spam corpus show near-complete isolation of the attack.

Naive Bayes spam filters are highly susceptible to data poisoning attacks. Here, known spam sources/blacklisted IPs exploit the fact that their received emails will be treated as (ground truth) labeled spam examples, and used for classifier training (or re-training). The attacking source thus generates emails that will skew the spam model, potentially resulting in great degradation in classifier accuracy. Such attacks are successful mainly because of the poor representation power of the naive Bayes (NB) model, with only a single (component) density to represent spam (plus a possible attack). We propose a defense based on the use of a mixture of NB models. We demonstrate that the learned mixture almost completely isolates the attack in a second NB component, with the original spam component essentially unchanged by the attack. Our approach addresses both the scenario where the classifier is being re-trained in light of new data and, significantly, the more challenging scenario where the attack is embedded in the original spam training set. Even for weak attack strengths, BIC-based model order selection chooses a two-component solution, which invokes the mixture-based defense. Promising results are presented on the TREC 2005 spam corpus.
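The idea in the abstract, a mixture of NB components for the spam class with BIC choosing the number of components, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes binary word-presence (Bernoulli) features, a plain EM fit with Laplace smoothing, and synthetic data in which 200 "genuine spam" emails and 60 "attack" emails use opposite vocabularies. All names (`fit_mixture`, `bic`, `sample`) and all numeric settings are hypothetical choices made for illustration.

```python
import math
import random

def sample(n, probs, rng):
    """Draw n binary word-presence vectors with per-word probabilities probs."""
    return [[1 if rng.random() < p else 0 for p in probs] for _ in range(n)]

def log_bernoulli(x, theta):
    """Log-likelihood of binary vector x under an NB component with parameters theta."""
    return sum(math.log(t) if xi else math.log(1.0 - t) for xi, t in zip(x, theta))

def fit_mixture(X, k, iters=100, seed=0):
    """EM for a k-component Bernoulli NB mixture; returns (weights, thetas, loglik)."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    # Random soft initialisation of the responsibilities (assumed scheme).
    resp = []
    for _ in range(n):
        r = [rng.random() + 1e-3 for _ in range(k)]
        total = sum(r)
        resp.append([v / total for v in r])
    eps = 1e-6  # keeps component weights strictly positive
    for _ in range(iters):
        # M-step: re-estimate weights and Laplace-smoothed word probabilities.
        weights, thetas = [], []
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights.append((nj + eps) / (n + k * eps))
            thetas.append([
                (sum(r[j] * x[w] for r, x in zip(resp, X)) + 1.0) / (nj + 2.0)
                for w in range(d)
            ])
        # E-step: recompute responsibilities and the data log-likelihood.
        loglik = 0.0
        for i, x in enumerate(X):
            logs = [math.log(weights[j]) + log_bernoulli(x, thetas[j]) for j in range(k)]
            m = max(logs)
            z = m + math.log(sum(math.exp(v - m) for v in logs))
            resp[i] = [math.exp(v - z) for v in logs]
            loglik += z
    return weights, thetas, loglik

def bic(loglik, k, n, d):
    """BIC with k*d word probabilities plus k-1 free mixing weights."""
    n_params = k * d + (k - 1)
    return -2.0 * loglik + n_params * math.log(n)

# Synthetic "spam" training set: genuine spam plus an embedded attack (assumed data).
rng = random.Random(1)
spam_probs = [0.8] * 10 + [0.1] * 10    # genuine spam favours words 0-9
attack_probs = [0.1] * 10 + [0.8] * 10  # attack emails favour words 10-19
X = sample(200, spam_probs, rng) + sample(60, attack_probs, rng)
n, d = len(X), len(X[0])

w1, t1, ll1 = fit_mixture(X, k=1)
w2, t2, ll2 = fit_mixture(X, k=2)
bic1, bic2 = bic(ll1, 1, n, d), bic(ll2, 2, n, d)
print("BIC (1 component):", round(bic1, 1))
print("BIC (2 components):", round(bic2, 1))
print("2-component weights:", [round(w, 3) for w in w2])
```

In this toy setting the two-component model gives the lower BIC, so model order selection would invoke the mixture. One component's weight and word probabilities can then be inspected to see whether it has absorbed the attack emails while the other stays close to the genuine spam profile. Real spam corpora are far less cleanly separated than this synthetic example.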
