LG | CR | DC | Jan 26, 2022

An Efficient and Robust System for Vertically Federated Random Forest

arXiv:2201.10761v1 | 15 citations
Originality: Incremental advance
AI Analysis

This work addresses the efficiency bottleneck faced by organizations that use vertically federated learning to build machine learning models while preserving data privacy, though it appears incremental because it builds on existing federated learning concepts.

The authors tackle the efficiency problem in vertically federated learning by developing a random forest system that achieves 5× and 83× speedups over the state-of-the-art SecureBoost model for training and serving tasks, respectively, while maintaining similar accuracy.

As there is growing interest in utilizing data across multiple resources to build better machine learning models, many vertically federated learning algorithms have been proposed to preserve the data privacy of the participating organizations. However, the efficiency of existing vertically federated learning algorithms remains a major problem, especially when they are applied to large-scale real-world datasets. In this paper, we present a fast, accurate, scalable, and yet robust system for vertically federated random forest. With extensive optimization, we achieve 5× and 83× speedups over the SOTA SecureBoost model (Cheng et al., 2019) for training and serving tasks. Moreover, the proposed system achieves similar accuracy with favorable scalability and partition tolerance. Our code has been made public to facilitate the development of the community and the protection of user data privacy.
