CR · LG · Oct 6, 2022

Federated Boosted Decision Trees with Differential Privacy

Oxford
arXiv:2210.02910v1 · 47 citations · h-index: 87 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses the need for scalable, secure, and efficient privacy-preserving machine learning over distributed data, particularly for tabular data applications. The advance is incremental, since it builds on existing differentially private decision tree approaches.

The paper tackles the problem of training Gradient Boosted Decision Tree (GBDT) models such as XGBoost in federated settings with formal privacy guarantees, and reports very high utility while maintaining strong levels of differential privacy.

There is great demand for scalable, secure, and efficient privacy-preserving machine learning models that can be trained over distributed data. While deep learning models typically achieve the best results in a centralized non-secure setting, different models can excel when privacy and communication constraints are imposed. Instead, tree-based approaches such as XGBoost have attracted much attention for their high performance and ease of use; in particular, they often achieve state-of-the-art results on tabular data. Consequently, several recent works have focused on translating Gradient Boosted Decision Tree (GBDT) models like XGBoost into federated settings, via cryptographic mechanisms such as Homomorphic Encryption (HE) and Secure Multi-Party Computation (MPC). However, these do not always provide formal privacy guarantees, or consider the full range of hyperparameters and implementation settings. In this work, we implement the GBDT model under Differential Privacy (DP). We propose a general framework that captures and extends existing approaches for differentially private decision trees. Our framework of methods is tailored to the federated setting, and we show that with a careful choice of techniques it is possible to achieve very high utility while maintaining strong levels of privacy.
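To make the idea of a differentially private GBDT update concrete, here is a minimal sketch, not the paper's implementation. It assumes a squared-error loss, per-example gradient clipping, and Gaussian noise added to the aggregated gradient and Hessian sums before an XGBoost-style leaf weight is computed. The names `client_sums`, `noisy_leaf_weight`, `grad_bound`, `sigma`, and `reg_lambda` are hypothetical. Calibrating `sigma` to a target privacy budget and protecting the sums in transit (for example with secure aggregation) are both left out.

```python
import random


def clip(value, bound):
    """Clamp value into [-bound, bound] to limit each example's influence."""
    return max(-bound, min(bound, value))


def client_sums(labels, predictions, grad_bound=1.0):
    """Sum of clipped gradients and Hessians held by one client.

    Assumes squared-error loss 0.5 * (p - y)^2, so the gradient is
    (p - y) and the Hessian is 1 for every example.
    """
    grad_sum = sum(clip(p - y, grad_bound) for y, p in zip(labels, predictions))
    hess_sum = float(len(labels))
    return grad_sum, hess_sum


def noisy_leaf_weight(client_stats, sigma, grad_bound=1.0, reg_lambda=1.0, rng=None):
    """Aggregate client sums, add Gaussian noise, return a leaf weight.

    The noise standard deviation is sigma times the per-example
    sensitivity (grad_bound for gradients, 1 for Hessians). Choosing
    sigma for a given privacy budget is outside this sketch.
    """
    rng = rng or random.Random()
    total_grad = sum(g for g, _ in client_stats)
    total_hess = sum(h for _, h in client_stats)
    noisy_grad = total_grad + rng.gauss(0.0, sigma * grad_bound)
    # Noise can push the Hessian sum below zero, so floor it at zero.
    noisy_hess = max(total_hess + rng.gauss(0.0, sigma * 1.0), 0.0)
    # XGBoost-style leaf weight: -G / (H + lambda).
    return -noisy_grad / (noisy_hess + reg_lambda)


if __name__ == "__main__":
    stats = [
        client_sums([1.0, 0.0], [0.5, 0.5]),
        client_sums([1.0], [0.0]),
    ]
    print(noisy_leaf_weight(stats, sigma=0.5, rng=random.Random(0)))
```

Only the aggregated sums are perturbed, so the amount of noise does not grow with the number of clients. In this sketch the server still sees each client's exact sums, which a real federated deployment would need to hide.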

Code Implementations: 2 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
