SplitOut: Out-of-the-Box Training-Hijacking Detection in Split Learning via Outlier Detection
This work addresses security vulnerabilities that split learning creates for clients, offering a simpler and more reliable detection method than prior heuristic-based approaches.
The paper tackles the problem of training-hijacking attacks in split learning, where servers can compromise client models, and shows that using an out-of-the-box outlier detection method can detect these attacks with almost-zero false positive rates.
Split learning enables efficient and privacy-aware training of a deep neural network by splitting the network so that the clients (data holders) compute the first layers and share only the intermediate output with the central compute-heavy server. This paradigm introduces a new attack medium in which the server has full control over what the client models learn, which has already been exploited to infer the private data of clients and to implement backdoors in the client models. Although previous work has shown that clients can successfully detect such training-hijacking attacks, the proposed methods rely on heuristics, require tuning of many hyperparameters, and do not fully utilize the clients' capabilities. In this work, we show that given modest assumptions regarding the clients' compute capabilities, an out-of-the-box outlier detection method can be used to detect existing training-hijacking attacks with almost-zero false positive rates. We conclude through experiments on different tasks that the simplicity of our approach, which we name \textit{SplitOut}, makes it a more viable and reliable alternative to the earlier detection methods.
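The detection idea can be illustrated with a minimal sketch. The abstract does not name the outlier detector or say what it is applied to, so the following are assumptions made purely for illustration: scikit-learn's `LocalOutlierFactor` stands in for the out-of-the-box detector, the client is assumed to be able to gather a reference set of honest gradients on its own, and both gradient sources are synthetic Gaussian stand-ins rather than gradients from a real split network. The names `honest_gradients`, `hijacked_gradients`, and `is_hijacked`, the dimension `DIM`, and the 0.5 threshold are all hypothetical.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)
DIM = 16  # hypothetical size of a flattened gradient returned by the server


def honest_gradients(n):
    """Synthetic stand-in for gradients produced by honest training."""
    return rng.normal(loc=0.0, scale=0.1, size=(n, DIM))


def hijacked_gradients(n):
    """Synthetic stand-in for gradients from a server pursuing another objective."""
    return rng.normal(loc=1.0, scale=0.1, size=(n, DIM))


# Assumption: the client can collect a reference set of honest gradients itself.
reference = honest_gradients(200)

# novelty=True lets the fitted detector score samples it has not seen before.
detector = LocalOutlierFactor(n_neighbors=20, novelty=True)
detector.fit(reference)


def is_hijacked(gradient_batch, threshold=0.5):
    """Flag a batch when most of its gradients are labeled as outliers."""
    labels = detector.predict(gradient_batch)  # +1 = inlier, -1 = outlier
    outlier_fraction = float(np.mean(labels == -1))
    return outlier_fraction > threshold


honest_flag = is_hijacked(honest_gradients(50))
attack_flag = is_hijacked(hijacked_gradients(50))
print("honest batch flagged:", honest_flag)
print("hijacked batch flagged:", attack_flag)
```

The detector is used with its default settings apart from `novelty=True`, which is what "out-of-the-box" is taken to mean here. The majority vote over a batch is one simple decision rule chosen for the sketch, not necessarily the rule used in the paper.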