Impact of Dataset Properties on Membership Inference Vulnerability of Deep Transfer Learning
This work addresses privacy risks that matter to machine learning practitioners, though its contribution is incremental because it builds on existing membership inference and fine-tuning research.
The paper investigates how dataset properties affect the vulnerability of deep transfer learning models to membership inference attacks. It finds that vulnerability decreases according to a power law as the number of examples per class increases, but that protecting the most vulnerable points requires very large datasets.
Membership inference attacks (MIAs) are used to test the practical privacy of machine learning models. MIAs complement formal guarantees from differential privacy (DP) under a more realistic adversary model. We analyse the MIA vulnerability of fine-tuned neural networks both empirically and theoretically, the latter using a simplified model of fine-tuning. We show that the vulnerability of non-DP models, measured as the attacker advantage at a fixed false positive rate, decreases according to a simple power law as the number of examples per class increases. A similar power law applies even to the most vulnerable points, but the dataset size needed to protect them adequately is very large.
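To make the two quantities in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a simple threshold attack on per-example scores, takes the attacker advantage to be TPR minus FPR at the chosen false positive rate, and models the power law as vulnerability ≈ c · S^(-β), where S is the number of examples per class. The function names, the symbols c and β, and the least-squares fit in log-log space are all illustrative assumptions.

```python
import numpy as np


def tpr_at_fpr(member_scores, nonmember_scores, target_fpr):
    """True positive rate of a threshold attack whose false positive
    rate on the non-member scores does not exceed target_fpr.
    Higher scores are assumed to mean "more likely a member"."""
    member_scores = np.asarray(member_scores, dtype=float)
    nonmember_desc = np.sort(np.asarray(nonmember_scores, dtype=float))[::-1]
    # Number of non-members the attack is allowed to flag by mistake.
    k = int(np.floor(target_fpr * len(nonmember_desc)))
    # Flag only scores strictly above the (k+1)-th largest non-member score,
    # so at most k non-members are flagged.
    threshold = nonmember_desc[k] if k < len(nonmember_desc) else -np.inf
    return float(np.mean(member_scores > threshold))


def advantage_at_fpr(member_scores, nonmember_scores, target_fpr):
    """Attacker advantage, assumed here to be TPR - FPR at the target FPR."""
    return tpr_at_fpr(member_scores, nonmember_scores, target_fpr) - target_fpr


def fit_power_law(examples_per_class, vulnerability):
    """Fit vulnerability ~= c * S**(-beta) by linear regression in log-log
    space. Returns (c, beta); all vulnerability values must be positive."""
    log_s = np.log(np.asarray(examples_per_class, dtype=float))
    log_v = np.log(np.asarray(vulnerability, dtype=float))
    slope, intercept = np.polyfit(log_s, log_v, 1)
    return float(np.exp(intercept)), float(-slope)
```

Under these assumptions, one would compute `advantage_at_fpr` for models fine-tuned at several values of S and pass the results to `fit_power_law`. A positive fitted β indicates that vulnerability falls as the number of examples per class grows.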