Matrix Sketching for Secure Collaborative Machine Learning
This work addresses privacy concerns for participants in collaborative learning and offers a practical defense against inference attacks.
The paper tackles privacy leakage in collaborative machine learning through communicated gradients and parameters. It proposes Double-Blind Collaborative Learning (DBCL), which uses random matrix sketching to prevent gradient-based attacks without substantially increasing computation and communication costs or hurting test accuracy.
Collaborative learning allows participants to jointly train a model without sharing data. To update the model parameters, the central server broadcasts the parameters to the clients, and the clients send update directions, such as gradients, back to the server. Although data never leave a client's device, the communicated gradients and parameters can leak a client's private information. Prior work has developed attacks that infer clients' private information from gradients and parameters. Simple defenses such as dropout and differential privacy either fail to defend against these attacks or seriously hurt test accuracy. We propose a practical defense which we call Double-Blind Collaborative Learning (DBCL). The high-level idea is to apply random matrix sketching to the parameters (aka weights) and to regenerate the random sketch after each iteration. DBCL prevents clients from conducting gradient-based privacy inferences, which are the most effective attacks. DBCL works because, from the attacker's perspective, sketching is effectively random noise that outweighs the signal. Notably, DBCL does not substantially increase computation and communication costs and does not hurt test accuracy at all.
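The high-level idea of sketching the weights with a freshly drawn random matrix at every iteration can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes a single dense layer, a Gaussian sketching matrix applied along the input dimension, and a per-iteration seed; the names `gaussian_sketch` and `sketch_weights`, the dimensions, and the seeding scheme are all hypothetical choices made for illustration.

```python
import numpy as np


def gaussian_sketch(d_in, s, seed):
    """Draw a d_in x s Gaussian sketching matrix (assumed sketch type).

    Scaling by 1/sqrt(s) makes S @ S.T equal the identity in expectation.
    """
    rng = np.random.default_rng(seed)
    return rng.standard_normal((d_in, s)) / np.sqrt(s)


def sketch_weights(W, S):
    """Compress the weights along the input dimension: (d_out x d_in) -> (d_out x s)."""
    return W @ S


# Hypothetical layer dimensions and sketch size, chosen only for the demo.
d_out, d_in, s = 4, 64, 32
W = np.random.default_rng(0).standard_normal((d_out, d_in))

sketches = []
for iteration in range(3):
    # A new sketching matrix is drawn at every iteration, so the sketched
    # weights observed in one round differ from those in the next round
    # even though W itself is unchanged here.
    S = gaussian_sketch(d_in, s, seed=1000 + iteration)
    sketches.append(sketch_weights(W, S))

print([sk.shape for sk in sketches])
print(np.allclose(sketches[0], sketches[1]))
```

In this toy setting the underlying weights are held fixed across iterations, which isolates the effect of regenerating the sketch: the sketched matrices change from round to round solely because the random matrix changes.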