CR, LG · Aug 20, 2021

UnSplit: Data-Oblivious Model Inversion, Model Stealing, and Label Inference Attacks Against Split Learning

Ege Erdogan, Alptekin Kupcu, A. Ercument Cicek

arXiv:2108.09033v232.1122 citations · Has Code

Originality: Incremental advance

AI Analysis

This work reveals critical security flaws in split learning, a concern for users who rely on it for privacy in distributed deep learning. It is rated as an incremental advance because it builds on known vulnerabilities.

The authors demonstrated that split learning, a distributed training scheme intended to protect privacy, is vulnerable to attacks where a server can recover client input samples and infer labels with perfect accuracy, exposing serious risks to data and model privacy.

Training deep neural networks often forces users to work in a distributed or outsourced setting, accompanied by privacy concerns. Split learning aims to address this concern by distributing the model among a client and a server. The scheme supposedly provides privacy, since the server cannot see the clients' models and inputs. We show that this is not true via two novel attacks. (1) We show that an honest-but-curious split learning server, equipped only with the knowledge of the client neural network architecture, can recover the input samples and obtain a functionally similar model to the client model, without being detected. (2) We show that if the client keeps hidden only the output layer of the model to "protect" the private labels, the honest-but-curious server can infer the labels with perfect accuracy. We test our attacks using various benchmark datasets and against proposed privacy-enhancing extensions to split learning. Our results show that plaintext split learning can pose serious risks, ranging from data (input) privacy to intellectual property (model parameters), and provides no more than a false sense of security.
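The abstract does not spell out how labels can leak when only the output layer stays on the client. The sketch below is an illustration under simplifying assumptions, not the authors' attack: it assumes the client-held output layer is a parameter-free softmax with cross-entropy loss, and that the client returns the loss gradient with respect to the logits the server sent. In that setting the gradient is the softmax probabilities minus the one-hot label, so its only negative entry marks the true class. The function names `softmax`, `client_gradient`, and `infer_label` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def client_gradient(logits, label):
    """Client side (assumed setup): gradient of cross-entropy(softmax(logits), label)
    with respect to the logits, which equals probs - one_hot(label)."""
    probs = softmax(logits)
    return [p - (1.0 if i == label else 0.0) for i, p in enumerate(probs)]


def infer_label(grad):
    """Server side: the true class is the index of the smallest (negative) gradient entry."""
    return min(range(len(grad)), key=lambda i: grad[i])


# The server sends logits, receives the gradient, and reads off the private label.
server_logits = [2.0, -1.0, 0.5, 0.1]
private_label = 2
returned_grad = client_gradient(server_logits, private_label)
recovered = infer_label(returned_grad)
```

With a trainable output layer on the client, the returned gradient is no longer this simple, so a real attack needs more machinery than this sketch shows.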
