Encrypted machine learning of molecular quantum properties

arXiv:2212.04322v2 | 3 citations | h-index: 8
AI Analysis

This work addresses privacy concerns for commercial applications in chemistry, though the advance is incremental: its main result is to expose a major cost barrier.

The authors tackled the problem of privacy in machine learning for molecular quantum properties by implementing encrypted models using oblivious transfer, but found that encrypted predictions are a million times more expensive than non-encrypted ones.

Large machine learning models with improved predictions have become widely available in the chemical sciences. Unfortunately, these models do not provide the privacy protection required in commercial settings, which prevents others from using potentially extremely valuable data. Encrypting the prediction process can solve this problem: double-blind model evaluation prevents the extraction of both training and query data. However, contemporary ML models based on fully homomorphic encryption or federated learning are either too expensive for practical use or gain speed at the cost of weaker security. We have implemented secure and computationally feasible encrypted machine learning models using oblivious transfer, enabling secure predictions of molecular quantum properties across chemical compound space. However, we find that encrypted predictions using kernel ridge regression models are a million times more expensive than predictions without encryption. This demonstrates a dire need for a compact machine learning model architecture, including molecular representation and kernel matrix size, that minimizes model evaluation costs.
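To make the closing point about representation and kernel matrix size concrete, here is a minimal, unencrypted sketch of kernel ridge regression in plain Python. It is not the authors' implementation: the Gaussian kernel, the regularization value, the toy data, and all function names (`gaussian_kernel`, `train`, `predict`, `prediction_cost`) are assumptions chosen for illustration. The sketch only shows that a single prediction touches every training point and every representation component, which is the work an encrypted protocol would have to repeat at much higher cost per operation.

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    # Assumed kernel choice; the source does not specify one.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def train(X, y, sigma=1.0, lam=1e-8):
    """Return regression weights alpha solving (K + lam * I) alpha = y."""
    n = len(X)
    K = [[gaussian_kernel(X[i], X[j], sigma) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve(K, y)

def predict(X_train, alpha, x_query, sigma=1.0):
    """Prediction is a weighted sum of kernels over ALL training points."""
    return sum(a * gaussian_kernel(x_i, x_query, sigma)
               for a, x_i in zip(alpha, X_train))

def prediction_cost(n_train, rep_dim):
    """Rough count of scalar multiplications per prediction:
    rep_dim squarings per kernel plus one multiplication by alpha,
    for each of the n_train training points (exponentials not counted)."""
    return n_train * (rep_dim + 1)

# Toy data: three "molecules" with 2-component representations (hypothetical).
X_toy = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
y_toy = [1.0, 2.0, 3.0]
alpha_toy = train(X_toy, y_toy)
```

In this sketch the per-prediction work grows as the product of training set size and representation length, so shrinking either one reduces evaluation cost in direct proportion, whatever the constant overhead of encryption turns out to be.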

Code Implementations: 1 repo
