Privacy-Preserving XGBoost Inference
This work addresses the privacy concerns of users who need accurate predictions but are unwilling to share sensitive data with commercial services. It represents an incremental improvement in privacy-preserving machine learning.
The paper tackles the problem of sensitive predictive queries in machine learning by proposing a privacy-preserving XGBoost inference algorithm. The algorithm has been implemented and evaluated on AWS SageMaker, and the experimental results indicate that it is efficient enough for real-world production environments.
Although machine learning (ML) is widely used for predictive tasks, there are important scenarios in which ML cannot be used or at least cannot achieve its full potential. A major barrier to adoption is the sensitive nature of predictive queries. Individual users may lack sufficiently rich datasets to train accurate models locally, yet be unwilling to send sensitive queries to commercial services that vend such models. One central goal of privacy-preserving machine learning (PPML) is to enable users to submit encrypted queries to a remote ML service, receive encrypted results, and decrypt them locally. We aim to develop practical solutions for real-world privacy-preserving ML inference problems. In this paper, we propose a privacy-preserving XGBoost prediction algorithm, which we have implemented and evaluated empirically on AWS SageMaker. Experimental results indicate that our algorithm is efficient enough to be used in real ML production environments.
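To make the encrypt, compute remotely, decrypt locally workflow concrete, here is a minimal sketch in Python. It is not the algorithm proposed in the paper, since the abstract does not say which encryption schemes are used. The sketch assumes a toy Paillier-style additively homomorphic scheme, hypothetical helper names (`keygen`, `encrypt`, `decrypt`, `aggregate`), a hypothetical fixed-point scale, and per-tree scores that are already available as ciphertexts. How a service would traverse trees over encrypted features is left out entirely.

```python
import math
import secrets

SCALE = 1000  # hypothetical fixed-point scale for real-valued scores


def keygen(p=2**31 - 1, q=2**61 - 1):
    """Toy Paillier keys from two Mersenne primes; far too small for real use."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # inverse of lam mod n (valid for generator g = n + 1)
    return n, (lam, mu)


def encrypt(n, value):
    """Client side: encrypt a real-valued score under the public key n."""
    n2 = n * n
    m = round(value * SCALE) % n  # negative values wrap around modulo n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random r in [1, n-1]
        if math.gcd(r, n) == 1:
            break
    # With g = n + 1, g^m mod n^2 simplifies to 1 + m*n.
    return (1 + m * n) * pow(r, n, n2) % n2


def decrypt(n, priv, c):
    """Client side: recover the real-valued score from a ciphertext."""
    lam, mu = priv
    m = (pow(c, lam, n * n) - 1) // n * mu % n
    if m > n // 2:  # map the upper half of the range back to negatives
        m -= n
    return m / SCALE


def aggregate(n, ciphertexts):
    """Service side: multiplying ciphertexts adds the plaintexts; no key needed."""
    total = 1  # 1 is a trivial encryption of 0
    for c in ciphertexts:
        total = total * c % (n * n)
    return total


# Round trip: the service only ever sees ciphertexts.
n, priv = keygen()
encrypted_scores = [encrypt(n, s) for s in (0.42, -0.13, 0.05)]
encrypted_sum = aggregate(n, encrypted_scores)
print(decrypt(n, priv, encrypted_sum))
```

The additive property matters here because an XGBoost prediction is a sum of per-tree leaf values, so a service can combine encrypted scores without holding the decryption key. The key size and the fixed-point encoding are illustrative choices only. A production system would rely on a vetted cryptographic library and much larger keys.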