CR, LG · Jan 8, 2019

Contamination Attacks and Mitigation in Multi-Party Machine Learning

arXiv:1901.02402v1 · 75 citations
AI Analysis

This paper addresses security and privacy risks for parties that collaborate in data-sharing ML scenarios, though its contribution is incremental because it builds on existing adversarial training methods.

The paper tackles the problem of malicious parties contaminating models in multi-party machine learning by providing tainted data, and demonstrates that adversarial training can defend against such attacks while ensuring party-level membership privacy.

Machine learning is data hungry; the more data a model has access to in training, the more likely it is to perform well at inference time. Distinct parties may want to combine their local data to gain the benefits of a model trained on a large corpus of data. We consider such a case: parties get access to the model trained on their joint data but do not see each other's individual datasets. We show that one needs to be careful when using this multi-party model since a potentially malicious party can taint the model by providing contaminated data. We then show how adversarial training can defend against such attacks by preventing the model from learning trends specific to individual parties' data, thereby also guaranteeing party-level membership privacy.
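
A defense of this kind is often written as a min-max objective. The formulation below is a generic sketch under stated assumptions, not the paper's own notation or exact method: the shared model f_θ, its internal representation h_θ, the auxiliary party classifier g_φ, the party label p, and the trade-off weight λ are all symbols introduced here for illustration.

```latex
% Assumed notation (not from the paper):
%   f_\theta : shared model,  h_\theta : its internal representation,
%   g_\phi   : auxiliary classifier that guesses the contributing party p,
%   \lambda  : trade-off weight between accuracy and party-indistinguishability.
\min_{\theta} \; \max_{\phi} \;
\mathbb{E}_{(x, y, p)} \Big[
  \mathcal{L}_{\mathrm{task}}\big(f_{\theta}(x),\, y\big)
  \;-\; \lambda \, \mathcal{L}_{\mathrm{party}}\big(g_{\phi}(h_{\theta}(x)),\, p\big)
\Big]
```

Here p denotes the party that contributed the record (x, y). The inner maximization trains g_φ to identify the contributing party, since maximizing the negated party loss minimizes it. The outer minimization keeps the model accurate on the task while making that identification hard, which discourages the model from encoding trends that appear in only one party's data.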
