CR, CV | Apr 12, 2022

Machine Learning Security against Data Poisoning: Are We There Yet?

arXiv:2204.05986v3 | 55 citations | h-index: 75
Originality: Synthesis-oriented
AI Analysis

This paper addresses the problem of keeping ML models trustworthy when their training data is maliciously manipulated, a concern for both researchers and practitioners. Its contribution is incremental, since it primarily reviews existing work.

The paper reviews data poisoning attacks that compromise machine learning models by manipulating training data to reduce performance, manipulate predictions, or implant backdoors, and discusses mitigation strategies using security principles and ML-oriented defenses.

The recent success of machine learning (ML) has been fueled by the increasing availability of computing power and large amounts of data in many different applications. However, the trustworthiness of the resulting models can be compromised when such data is maliciously manipulated to mislead the learning process. In this article, we first review poisoning attacks that compromise the training data used to learn ML models, including attacks that aim to reduce the overall performance, manipulate the predictions on specific test samples, and even implant backdoors in the model. We then discuss how to mitigate these attacks using basic security principles, or by deploying ML-oriented defensive mechanisms. We conclude our article by formulating some relevant open challenges that hinder the development of testing methods and benchmarks suitable for assessing and improving the trustworthiness of ML models against data poisoning attacks.
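To make the first attack class in the abstract concrete (poisoning that reduces overall performance), here is a minimal toy sketch. It is not taken from the paper: the nearest-centroid classifier, the synthetic two-blob data, the injected points near (20, 0), and the median-distance filter with `radius=5.0` are all assumptions chosen for illustration. It shows how a handful of mislabelled outliers can drag a class centroid away from its clean data, and how a simple sanitization step, one example of an ML-oriented defense, can undo the damage in this easy setting.

```python
import math
import random
import statistics

def make_data(rng, n_per_class):
    """Two 2-D Gaussian blobs: class 0 around (-2, 0), class 1 around (+2, 0)."""
    data = []
    for label, cx in ((0, -2.0), (1, 2.0)):
        for _ in range(n_per_class):
            data.append(((rng.gauss(cx, 1.0), rng.gauss(0.0, 1.0)), label))
    return data

def fit_centroids(data):
    """Nearest-centroid 'training': compute the mean point of each class."""
    centroids = {}
    for label in (0, 1):
        pts = [x for x, y in data if y == label]
        centroids[label] = tuple(sum(c) / len(pts) for c in zip(*pts))
    return centroids

def predict(centroids, x):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: math.dist(x, centroids[label]))

def accuracy(centroids, data):
    return sum(predict(centroids, x) == y for x, y in data) / len(data)

def inject_poison(rng, data, n_poison):
    """Hypothetical attacker: append far-away points labelled as class 0."""
    poison = [((rng.gauss(20.0, 0.1), rng.gauss(0.0, 0.1)), 0)
              for _ in range(n_poison)]
    return data + poison

def sanitize(data, radius=5.0):
    """Drop points farther than `radius` from their class's coordinate-wise median."""
    kept = []
    for label in (0, 1):
        pts = [x for x, y in data if y == label]
        med = tuple(statistics.median(c) for c in zip(*pts))
        kept += [(x, label) for x in pts if math.dist(x, med) <= radius]
    return kept

rng = random.Random(0)  # fixed seed so the run is repeatable
train = make_data(rng, 100)
test = make_data(rng, 100)
poisoned_train = inject_poison(rng, train, 40)

clean_acc = accuracy(fit_centroids(train), test)
poisoned_acc = accuracy(fit_centroids(poisoned_train), test)
sanitized_acc = accuracy(fit_centroids(sanitize(poisoned_train)), test)
print(f"clean={clean_acc:.2f} poisoned={poisoned_acc:.2f} sanitized={sanitized_acc:.2f}")
```

The median is used in the filter because, unlike the mean, it stays close to the clean data as long as the poisoning points are a minority of the class. Real attacks are typically crafted to evade such simple outlier checks, so this should be read as an illustration of the threat model rather than as a recommended defense.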
