LG | ML | Sep 15, 2020

Data Poisoning Attacks on Regression Learning and Corresponding Defenses

arXiv:2009.07008v1 | 27 citations
Originality Incremental advance
AI Analysis

This paper addresses security threats to mission-critical regression systems such as medical dosing and cyber-physical control. The advance is incremental: it extends data poisoning studies from classification to regression.

The paper tackles data poisoning attacks on regression learning, showing that inserting just 2% of poison samples increases the mean squared error to 150%, and proposes a defense strategy that effectively mitigates these attacks across 26 datasets.

Adversarial data poisoning is an effective attack against machine learning that threatens model integrity by introducing poisoned data into the training dataset. So far, it has been studied mostly for classification, even though regression learning is used in many mission-critical systems (such as medication dosing, control of cyber-physical systems, and power supply management). Therefore, in the present research, we aim to evaluate all aspects of data poisoning attacks on regression learning, exceeding previous work in both breadth and depth. We present realistic scenarios in which data poisoning attacks threaten production systems and introduce a novel black-box attack, which is then applied to a real-world medical use case. We observe that the mean squared error (MSE) of the regressor increases to 150 percent when only two percent of poison samples are inserted. Finally, we present a new defense strategy against the novel and previous attacks and evaluate it thoroughly on 26 datasets. From these experiments, we conclude that the proposed defense strategy effectively mitigates the considered attacks.
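
To make the poisoning setup concrete, the following is a minimal sketch of how a small fraction of injected samples can degrade a regressor. It is not the paper's black-box attack: the data are synthetic, the attack is a naive target-flipping heuristic chosen only for illustration, and all names (`fit_linear`, `mse`, `poison`) are hypothetical.

```python
import numpy as np

def fit_linear(X, y):
    # Ordinary least squares with an intercept column appended.
    A = np.column_stack([X, np.ones(len(X))])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def mse(w, X, y):
    # Mean squared error of the linear model w on (X, y).
    A = np.column_stack([X, np.ones(len(X))])
    return float(np.mean((A @ w - y) ** 2))

def poison(X, y, rate, rng):
    # Assumed naive attack (not the paper's): copy a random subset of
    # training points and push each target to the opposite extreme of
    # the observed target range, then append the copies to the data.
    n_poison = int(round(rate * len(X)))
    idx = rng.choice(len(X), size=n_poison, replace=False)
    lo, hi = y.min(), y.max()
    y_p = np.where(y[idx] > (lo + hi) / 2, lo, hi)
    return np.vstack([X, X[idx]]), np.concatenate([y, y_p])

rng = np.random.default_rng(0)
w_true = np.array([2.0, -1.0, 0.5])

# Synthetic clean training and held-out data (an assumption of this sketch).
X = rng.uniform(-1, 1, size=(500, 3))
y = X @ w_true + rng.normal(0, 0.1, size=500)
X_test = rng.uniform(-1, 1, size=(200, 3))
y_test = X_test @ w_true + rng.normal(0, 0.1, size=200)

# Train once on clean data and once with 2% poison samples added.
w_clean = fit_linear(X, y)
X_pois, y_pois = poison(X, y, rate=0.02, rng=rng)
w_pois = fit_linear(X_pois, y_pois)

# Compare errors on clean held-out data.
mse_clean = mse(w_clean, X_test, y_test)
mse_pois = mse(w_pois, X_test, y_test)
print(f"clean MSE:    {mse_clean:.4f}")
print(f"poisoned MSE: {mse_pois:.4f}")
print(f"ratio (poisoned / clean): {mse_pois / mse_clean:.2f}")
```

The ratio printed at the end corresponds to the kind of quantity the abstract reports (MSE after poisoning relative to the clean baseline), although the value here depends entirely on the synthetic data and the toy attack and should not be compared with the paper's results.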

Code Implementations | 2 repos
