Collaborative Drug Discovery: Inference-level Data Protection Perspective
This work addresses privacy concerns for pharmaceutical companies using collaborative ML to accelerate drug candidate selection, but it is incremental as it adapts existing attacks and protections to a specific domain.
The paper tackles unintended data leakage in collaborative machine learning platforms for drug discovery. It assesses the privacy risks and tests protection techniques, customizing several state-of-the-art inference attacks and mitigation strategies along the way.
The pharmaceutical industry can better leverage its data assets to virtualize drug discovery through a collaborative machine learning platform. On the other hand, there are non-negligible risks stemming from the unintended leakage of participants' training data; hence, it is essential for such a platform to be secure and privacy-preserving. This paper describes a privacy risk assessment for collaborative modeling in the preclinical phase of drug discovery to accelerate the selection of promising drug candidates. After a short taxonomy of state-of-the-art inference attacks, we adopt and customize several to the underlying scenario. Finally, we describe and experiment with a handful of relevant privacy protection techniques to mitigate such attacks.
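
To make the notion of an inference attack concrete, here is a minimal sketch of a loss-threshold membership inference test, a widely used baseline in the privacy literature. The abstract does not say which attacks the paper customizes, so this is an illustration only and not the authors' method. The function names, the example probabilities, and the threshold value are all hypothetical.

```python
import math


def binary_cross_entropy(p, y):
    """Per-sample loss for predicted probability p and true label y (0 or 1)."""
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def infer_membership(probs, labels, threshold):
    """Guess 'member' (True) when the model's loss on a sample is below threshold.

    Intuition: models tend to fit their training samples more closely,
    so an unusually low loss hints that a sample was in the training set.
    """
    return [binary_cross_entropy(p, y) < threshold for p, y in zip(probs, labels)]


# Hypothetical model outputs (assumed for illustration, not real data):
# confident predictions on samples the model was trained on,
# less confident predictions on samples it has never seen.
train_probs, train_labels = [0.98, 0.03, 0.95], [1, 0, 1]
unseen_probs, unseen_labels = [0.60, 0.45, 0.70], [1, 0, 1]
threshold = 0.2  # assumed; in practice an attacker would calibrate this

guesses_train = infer_membership(train_probs, train_labels, threshold)
guesses_unseen = infer_membership(unseen_probs, unseen_labels, threshold)
print(guesses_train, guesses_unseen)
```

In a collaborative setting, an attack of this kind would let one participant test whether a particular compound-activity record was part of another participant's training data. Protection techniques aim to shrink the loss gap that the attack exploits.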