CRAIJun 12, 2024

Ents: An Efficient Three-party Training Framework for Decision Trees by Communication Optimization

arXiv:2406.07948v59 citations
Originality Incremental advance
AI Analysis

This work addresses communication bottlenecks for parties needing to train decision trees on distributed private data with privacy preservation, representing a strong incremental improvement over existing methods.

The paper tackles the communication inefficiency in multi-party training frameworks for decision trees by proposing Ents, an efficient three-party framework that optimizes communication through secure radix sort protocols and share conversion, resulting in improvements of 5.5x to 9.3x in communication sizes, 3.9x to 5.3x in communication rounds, and 3.5x to 6.7x in training time.

Multi-party training frameworks for decision trees based on secure multi-party computation enable multiple parties to train high-performance models on distributed private data with privacy preservation. The training process essentially involves frequent dataset splitting according to the splitting criterion (e.g. Gini impurity). However, existing multi-party training frameworks for decision trees demonstrate communication inefficiency due to the following issues: (1) They suffer from huge communication overhead in securely splitting a dataset with continuous attributes. (2) They suffer from huge communication overhead due to performing almost all the computations on a large ring to accommodate the secure computations for the splitting criterion. In this paper, we are motivated to present an efficient three-party training framework, namely Ents, for decision trees by communication optimization. For the first issue, we present a series of training protocols based on the secure radix sort protocols to efficiently and securely split a dataset with continuous attributes. For the second issue, we propose an efficient share conversion protocol to convert shares between a small ring and a large ring to reduce the communication overhead incurred by performing almost all the computations on a large ring. Experimental results from eight widely used datasets show that Ents outperforms state-of-the-art frameworks by $5.5\times \sim 9.3\times$ in communication sizes and $3.9\times \sim 5.3\times$ in communication rounds. In terms of training time, Ents yields an improvement of $3.5\times \sim 6.7\times$. To demonstrate its practicality, Ents requires less than three hours to securely train a decision tree on a widely used real-world dataset (Skin Segmentation) with more than 245,000 samples in the WAN setting.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes