Cryptotree: fast and accurate predictions on encrypted structured data
Cryptotree enables secure and accurate predictions on private data such as financial or medical records, offering a new approach to a known bottleneck in homomorphic encryption.
The paper tackles the problem of applying powerful machine learning models such as Random Forests to encrypted data while preserving privacy, reporting fast inference and prediction results better than the original Random Forest on encrypted data.
Applying machine learning algorithms to private data, such as financial or medical data, while preserving its confidentiality is a difficult task. Homomorphic Encryption (HE) allows computation on encrypted data, where both the input and the output are encrypted, and therefore enables secure inference on private data. Nonetheless, because of the constraints of HE, such as its inability to evaluate non-polynomial functions or to perform arbitrary matrix multiplication efficiently, only inference with linear models seems usable in practice in the HE paradigm so far. In this paper, we propose Cryptotree, a framework that enables the use of Random Forests (RF), a very powerful learning procedure compared to linear regression, in the context of HE. To this end, we first convert a regular RF to a Neural RF, then adapt it to fit the HE scheme CKKS, which allows HE operations on real values. Through SIMD operations, we achieve quick inference and prediction results better than the original RF on encrypted data.
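To make the conversion step concrete, the following is a minimal plaintext sketch of how a single decision tree can be written as a three-layer network, which is the kind of structure a Neural RF builds on. It is an illustration under stated assumptions, not the paper's implementation: the toy tree, the names `FEATURES`, `THRESHOLDS`, `PATHS`, `LEAF_VALUES`, and the `tanh` surrogate are all hypothetical. No encryption is performed here. Under CKKS the activation would have to be a polynomial, so `tanh` only stands in for whatever polynomial approximation is actually used.

```python
import math

# Hypothetical toy tree (illustrative only, not taken from the paper):
#   node 0: x[0] <= 0.5 ? go to node 1 : leaf C (value 3.0)
#   node 1: x[1] <= 0.3 ? leaf A (value 1.0) : leaf B (value 2.0)
FEATURES = [0, 1]          # feature tested at each internal node
THRESHOLDS = [0.5, 0.3]    # threshold used at each internal node
# PATHS[l][k] is -1 / +1 if leaf l lies left / right of node k,
# and 0 if node k is not on the path to leaf l.
PATHS = [[-1, -1], [-1, +1], [+1, 0]]
LEAF_VALUES = [1.0, 2.0, 3.0]


def tree_predict(x):
    """Reference prediction of the toy tree, written as plain branches."""
    if x[0] <= 0.5:
        return 1.0 if x[1] <= 0.3 else 2.0
    return 3.0


def hard_sign(z):
    """Non-polynomial activation: -1 for z <= 0, +1 otherwise."""
    return 1.0 if z > 0 else -1.0


def smooth_sign(z, gamma=10.0):
    """Smooth stand-in for the sign function (assumed, not HE-ready)."""
    return math.tanh(gamma * z)


def neural_tree_predict(x, activation=hard_sign):
    """Evaluate the toy tree as a three-layer network."""
    # Layer 1: one neuron per internal node, encoding the left/right decision.
    u = [activation(x[f] - t) for f, t in zip(FEATURES, THRESHOLDS)]
    # Layer 2: one neuron per leaf. The weighted sum equals the path length
    # only when every decision on the path matches, so subtracting
    # (path length - 1) leaves a positive value for exactly one leaf.
    v = []
    for path in PATHS:
        depth = sum(abs(p) for p in path)
        score = sum(p * u_k for p, u_k in zip(path, u))
        v.append(activation(score - depth + 1))
    # Layer 3: map the +/-1 leaf indicators to {0, 1} and weight by leaf value.
    return sum(val * (v_l + 1) / 2 for val, v_l in zip(LEAF_VALUES, v))
```

With `hard_sign`, `neural_tree_predict` reproduces `tree_predict` exactly on this toy tree. Passing `activation=smooth_sign` gives an approximation that is close for inputs away from the thresholds and degrades near them, which illustrates why the choice of activation matters once only polynomial operations are available.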