LG Oct 18, 2023

Effective and Efficient Federated Tree Learning on Hybrid Data

arXiv:2310.11865v25 citations h-index: 15
Originality: Highly original
AI Analysis

The paper addresses a practical federated learning scenario in which parties hold hybrid data, and it offers an efficient solution with low overhead.

The paper tackles the problem of federated tree learning on hybrid data, where data from different parties differ in both features and samples, by proposing HybridTree, which achieves comparable accuracy to centralized settings with up to 8 times speedup over baselines.

Federated learning has emerged as a promising distributed learning paradigm that facilitates collaborative learning among multiple parties without transferring raw data. However, most existing federated learning studies focus on either horizontal or vertical data settings, where the data of different parties are assumed to be from the same feature or sample space. In practice, a common scenario is the hybrid data setting, where data from different parties may differ both in the features and samples. To address this, we propose HybridTree, a novel federated learning approach that enables federated tree learning on hybrid data. We observe the existence of consistent split rules in trees. With the help of these split rules, we theoretically show that the knowledge of parties can be incorporated into the lower layers of a tree. Based on our theoretical analysis, we propose a layer-level solution that does not need frequent communication traffic to train a tree. Our experiments demonstrate that HybridTree can achieve comparable accuracy to the centralized setting with low computational and communication overhead. HybridTree can achieve up to 8 times speedup compared with the other baselines.
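To make the three data settings named in the abstract concrete, the following is a minimal sketch that labels a partition as horizontal, vertical, or hybrid. It is not part of HybridTree and is not taken from the paper: the function `classify_setting`, the `Party` type, and the party names are all hypothetical, and the exact-set-equality test is a simplifying assumption (real deployments usually involve partial overlap).

```python
from typing import Dict, Set, Tuple

# Hypothetical description of a party: (sample IDs it holds, feature names it holds).
Party = Tuple[Set[int], Set[str]]


def classify_setting(parties: Dict[str, Party]) -> str:
    """Label a data partition as 'horizontal', 'vertical', 'hybrid', or 'identical'.

    Simplifying assumption: two parties share a feature (or sample) space
    only if their sets are exactly equal.
    """
    sample_sets = [samples for samples, _ in parties.values()]
    feature_sets = [features for _, features in parties.values()]

    same_samples = all(s == sample_sets[0] for s in sample_sets)
    same_features = all(f == feature_sets[0] for f in feature_sets)

    if same_features and not same_samples:
        return "horizontal"  # same features, different samples
    if same_samples and not same_features:
        return "vertical"    # same samples, different features
    if not same_samples and not same_features:
        return "hybrid"      # parties differ in both features and samples
    return "identical"       # every party holds the same data layout


# Illustrative partitions (all names and values are made up).
horizontal = {
    "party_a": ({0, 1, 2}, {"age", "income"}),
    "party_b": ({3, 4, 5}, {"age", "income"}),
}
vertical = {
    "party_a": ({0, 1, 2}, {"age", "income"}),
    "party_b": ({0, 1, 2}, {"purchases"}),
}
hybrid = {
    "host": ({0, 1, 2, 3, 4, 5}, {"age", "income"}),
    "guest_a": ({0, 1, 2}, {"purchases"}),
    "guest_b": ({3, 4, 5}, {"clicks"}),
}

for name, partition in [("horizontal", horizontal), ("vertical", vertical), ("hybrid", hybrid)]:
    print(name, "->", classify_setting(partition))
```

In the hybrid example, no two parties share both a sample set and a feature set, which is the case that purely horizontal or purely vertical methods do not cover.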
