LG | AI | Apr 3, 2025

Tree-based Models for Vertical Federated Learning: A Survey

arXiv:2504.02285v1 | 8 citations | h-index: 18 | ACM Computing Surveys
Originality: Synthesis-oriented
AI Analysis

This is an incremental survey that synthesizes existing research on tree-based models for VFL, aiding researchers and practitioners in understanding and implementing these methods.

This survey covers tree-based models in vertical federated learning (VFL). It categorizes them into feature-gathering and label-scattering types, analyzes their protocols, privacy, and applications, and compares them empirically through experiments.

Tree-based models have achieved great success in a wide range of real-world applications due to their effectiveness, robustness, and interpretability, which has motivated their application to vertical federated learning (VFL) scenarios in recent years. In this paper, we conduct a comprehensive study that gives an overall picture of applying tree-based models in VFL from the perspective of their communication and computation protocols. We categorize tree-based models in VFL into two types, feature-gathering models and label-scattering models, and discuss in detail their characteristics, advantages, privacy protection mechanisms, and applications. This study also addresses the implementation of tree-based models in VFL, summarizing several design principles for better meeting the requirements of both academic research and industrial deployment. Finally, we conduct a series of experiments to provide empirical observations on the differences between the two types of tree-based models and their relative advantages.

