LG ML Aug 21, 2020

Defending Distributed Classifiers Against Data Poisoning Attacks

Sandamal Weerasinghe, Tansu Alpcan, Sarah M. Erfani, Christopher Leckie

arXiv:2008.09284v12.31 citations Has Code

Originality: Incremental advance

AI Analysis

This work addresses security concerns for SVM-based systems in engineering and life-critical applications, offering a defense against targeted attacks, though it appears incremental as it builds on existing LID metrics.

The paper tackles the vulnerability of Support Vector Machines (SVMs) to data poisoning attacks with a defense algorithm built on K-LID, a new approximation of Local Intrinsic Dimensionality (LID) that uses kernel distance. K-LID is used to de-emphasize suspicious training samples. In experiments on benchmark data sets, the defense reduces classification error rates by 10% on average.

Support Vector Machines (SVMs) are vulnerable to targeted training data manipulations such as poisoning attacks and label flips. By carefully manipulating a subset of training samples, the attacker forces the learner to compute an incorrect decision boundary, thereby causing misclassifications. Considering the increased importance of SVMs in engineering and life-critical applications, we develop a novel defense algorithm that improves resistance against such attacks. Local Intrinsic Dimensionality (LID) is a promising metric that characterizes the outlierness of data samples. In this work, we introduce a new approximation of LID called K-LID that uses kernel distance in the LID calculation, which allows LID to be calculated in high-dimensional transformed spaces. We introduce a weighted SVM against such attacks using K-LID as a distinguishing characteristic that de-emphasizes the effect of suspicious data samples on the SVM decision boundary. Each sample is weighted by how likely its K-LID value is to come from the benign K-LID distribution rather than the attacked K-LID distribution. We then demonstrate how the proposed defense can be applied to a distributed SVM framework through a case study on an SDR-based surveillance system. Experiments with benchmark data sets show that the proposed defense reduces classification error rates substantially (10% on average).
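
The weighting idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the standard maximum-likelihood LID estimator, an RBF kernel whose induced feature-space distance stands in for the paper's kernel distance, and a simple Gaussian kernel density estimate for the benign and attacked K-LID distributions. The function names, the `gamma`, `k`, and `bandwidth` parameters, and the normalized weight `benign / (benign + attacked)` are all hypothetical choices made for illustration.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def kernel_distance(x, y, gamma=0.5):
    """Distance in the RBF feature space (assumed form of 'kernel distance').

    For the RBF kernel k(x, x) = 1, so ||phi(x) - phi(y)||^2 = 2 - 2 k(x, y).
    """
    return math.sqrt(max(0.0, 2.0 - 2.0 * rbf_kernel(x, y, gamma)))


def k_lid(x, neighbours, k=3, gamma=0.5):
    """Maximum-likelihood LID estimate computed from kernel distances.

    LID(x) = -1 / mean(log(r_i / r_max)) over the k nearest neighbours.
    Returns math.inf when the estimate is undefined (no spread in distances).
    """
    all_dists = (kernel_distance(x, n, gamma) for n in neighbours)
    dists = sorted(d for d in all_dists if d > 0.0)[:k]
    if not dists:
        return math.inf
    r_max = dists[-1]
    mean_log = sum(math.log(d / r_max) for d in dists) / len(dists)
    return math.inf if mean_log == 0.0 else -1.0 / mean_log


def gaussian_kde(samples, bandwidth=0.3):
    """Return a 1-D Gaussian kernel density estimate as a callable."""
    norm = len(samples) * bandwidth * math.sqrt(2.0 * math.pi)

    def pdf(value):
        total = sum(
            math.exp(-0.5 * ((value - s) / bandwidth) ** 2) for s in samples
        )
        return total / norm

    return pdf


def sample_weight(lid_value, benign_pdf, attacked_pdf, eps=1e-12):
    """Weight in [0, 1]: near 1 if the K-LID looks benign, near 0 if attacked.

    Assumption: a normalized likelihood ratio; the paper may use another form.
    """
    benign = benign_pdf(lid_value)
    attacked = attacked_pdf(lid_value)
    return benign / (benign + attacked + eps)


# Illustrative (made-up) K-LID values for benign and attacked samples.
benign_pdf = gaussian_kde([1.0, 1.2, 0.9])
attacked_pdf = gaussian_kde([3.0, 3.5, 2.8])

point = (0.0, 0.0)
neighbours = [(0.1, 0.0), (0.2, 0.0), (0.4, 0.0)]
lid = k_lid(point, neighbours)
weight = sample_weight(lid, benign_pdf, attacked_pdf)
```

Weights of this kind could then be passed to any SVM trainer that accepts per-sample weights, for example the `sample_weight` argument of scikit-learn's `SVC.fit`, so that samples with attack-like K-LID values contribute less to the decision boundary.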
