CR · LG · Feb 20, 2023

Poisoning Web-Scale Training Datasets is Practical

DeepMind, ETH Zurich
arXiv:2302.10149v2 · 329 citations · h-index: 63
AI Analysis

This work addresses a critical security vulnerability in the distributed, web-scale training datasets used for deep learning, one that directly threatens model integrity and reliability.

The paper introduces two practical dataset poisoning attacks, split-view and frontrunning, that exploit mutable internet content and invalid trust assumptions to inject malicious examples into web-scale datasets. The authors show that poisoning 0.01% of LAION-400M or COYO-700M would have cost as little as $60 USD.

Deep learning models are often trained on distributed, web-scale datasets crawled from the internet. In this paper, we introduce two new dataset poisoning attacks that intentionally introduce malicious examples to degrade a model's performance. Our attacks are immediately practical and could, today, poison 10 popular datasets. Our first attack, split-view poisoning, exploits the mutable nature of internet content to ensure a dataset annotator's initial view of the dataset differs from the view downloaded by subsequent clients. By exploiting specific invalid trust assumptions, we show how we could have poisoned 0.01% of the LAION-400M or COYO-700M datasets for just $60 USD. Our second attack, frontrunning poisoning, targets web-scale datasets that periodically snapshot crowd-sourced content, such as Wikipedia, where an attacker only needs a time-limited window to inject malicious examples. In light of both attacks, we notified the maintainers of each affected dataset and recommended several low-overhead defenses.
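To make the split-view idea concrete, here is a minimal sketch rather than the authors' code. It assumes a dataset index that stores a SHA-256 digest of each URL's content at annotation time, and it models the web as a plain dictionary. The URLs, the helper names `build_index` and `find_changed`, and the nominal dataset sizes of 400 million and 700 million examples (read off the dataset names) are all illustrative assumptions. Digest checking is one common integrity measure against this kind of content swap; the abstract does not say which defenses were recommended.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def build_index(web: dict[str, bytes]) -> dict[str, str]:
    """Record a content digest for every URL, as seen at annotation time."""
    return {url: sha256_hex(body) for url, body in web.items()}


def find_changed(index: dict[str, str], web: dict[str, bytes]) -> list[str]:
    """Return URLs whose current content is missing or differs from the index."""
    return sorted(
        url
        for url, digest in index.items()
        if url not in web or sha256_hex(web[url]) != digest
    )


# Hypothetical toy "web": what the annotator saw when the index was built.
web_at_annotation = {
    "http://a.example/cat.jpg": b"cat pixels",
    "http://b.example/dog.jpg": b"dog pixels",
}
index = build_index(web_at_annotation)

# Later, an attacker who controls b.example swaps the content behind the URL.
web_at_download = dict(web_at_annotation)
web_at_download["http://b.example/dog.jpg"] = b"attacker pixels"

# A downloading client that checks digests notices the split view.
changed = find_changed(index, web_at_download)
print("changed since annotation:", changed)

# Scale of the attack: 0.01% is one example in 10,000.
# Dataset sizes below are nominal values assumed from the dataset names.
poisoned_laion = 400_000_000 // 10_000
poisoned_coyo = 700_000_000 // 10_000
print("0.01% of LAION-400M:", poisoned_laion, "examples")
print("0.01% of COYO-700M:", poisoned_coyo, "examples")
```

A client that downloads without comparing digests would silently accept the swapped content, which is the gap the split-view attack relies on.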

Code Implementations: 1 repo
