CR, LG · Oct 24, 2023

Poison is Not Traceless: Fully-Agnostic Detection of Poisoning Attacks

Xinglong Chang, Katharina Dost, Gillian Dobbie, Jörg Wicker

arXiv:2310.16224v12.31 citations · h-index: 4

Originality: Highly original

AI Analysis

DIVA addresses the limited real-world applicability of existing poisoning detectors and offers a more general approach to securing ML systems.

The paper tackles the detection of poisoning attacks on machine learning training data. It introduces DIVA, a fully-agnostic framework that uses complexity measures to estimate a classifier's accuracy on clean data, so detection does not depend on a specific data type, model, or attack.

The performance of machine learning models depends on the quality of the underlying data. Malicious actors can attack a model by poisoning its training data. Current detectors are tied to specific data types, models, or attacks, and therefore have limited applicability in real-world scenarios. This paper presents a novel fully-agnostic framework, DIVA (Detecting InVisible Attacks), that detects attacks solely by analyzing the potentially poisoned dataset. DIVA is based on the idea that poisoning attacks can be detected by comparing the classifier's accuracy on poisoned and clean data. Because the accuracy on a hypothetical clean dataset is otherwise unknown, DIVA pre-trains a meta-learner that estimates it from Complexity Measures. The framework applies to generic poisoning attacks. For evaluation purposes, in this paper, we test DIVA on label-flipping attacks.
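
The comparison at the heart of the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the absolute-gap decision rule, the threshold of 0.1, and the toy meta-learner are all assumptions made for illustration. The only idea taken from the abstract is that an accuracy observed on the possibly poisoned data is compared against a clean-data accuracy estimated from complexity measures.

```python
from typing import Callable, Sequence


def detect_poisoning(
    complexity_measures: Sequence[float],
    observed_accuracy: float,
    estimate_clean_accuracy: Callable[[Sequence[float]], float],
    threshold: float = 0.1,
) -> bool:
    """Flag a dataset when observed and estimated clean accuracy diverge.

    Assumption: an absolute gap above `threshold` counts as suspicious.
    The threshold value is illustrative and not taken from the paper.
    """
    estimated_clean = estimate_clean_accuracy(complexity_measures)
    return abs(estimated_clean - observed_accuracy) > threshold


def toy_meta_learner(complexity_measures: Sequence[float]) -> float:
    """Hypothetical stand-in for the pre-trained meta-learner.

    Maps complexity measures to an accuracy estimate in [0, 1]. Here,
    higher average complexity simply means lower expected accuracy.
    """
    mean_complexity = sum(complexity_measures) / len(complexity_measures)
    return max(0.0, min(1.0, 1.0 - mean_complexity))


# Usage: the same complexity profile with two different observed accuracies.
measures = [0.1, 0.1]
suspicious = detect_poisoning(measures, 0.70, toy_meta_learner)
consistent = detect_poisoning(measures, 0.88, toy_meta_learner)
```

In a real pipeline the stand-in would be replaced by a regressor trained beforehand on many datasets with known clean accuracy, and the complexity measures would be computed from the dataset under inspection.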
