LG · DC · Feb 9, 2022

Vertical Federated Learning: Challenges, Methodologies and Experiments

arXiv:2202.04309v2 · 117 citations
AI Analysis

This work addresses privacy and efficiency issues in distributed machine learning for applications with vertically partitioned data, but it is incremental as it builds on existing VFL concepts.

The paper tackles the unique challenges of vertical federated learning (VFL), such as privacy risks and high costs, by proposing a general framework and solutions, with experiments on real-life datasets demonstrating their effectiveness.

Recently, federated learning (FL) has emerged as a promising distributed machine learning (ML) technology, driven by the advancing computational and sensing capacities of end-user devices and by increasing concerns about users' privacy. As a special architecture in FL, vertical FL (VFL) constructs a hyper ML model by combining sub-models from different clients. These sub-models are trained locally on vertically partitioned data with distinct attributes. The design of VFL is therefore fundamentally different from that of conventional FL, raising new and unique research issues. In this paper, we discuss key challenges in VFL along with effective solutions, and conduct experiments on real-life datasets to shed light on these issues. Specifically, we first propose a general framework for VFL and highlight the key differences between VFL and conventional FL. Then, we discuss research challenges rooted in VFL systems under four aspects: security and privacy risks, expensive computation and communication costs, possible structural damage caused by model splitting, and system heterogeneity. Afterwards, we develop solutions to address these challenges and conduct extensive experiments to showcase the effectiveness of the proposed solutions.
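
To make "vertically partitioned data with distinct attributes" concrete, here is a minimal sketch of one forward pass in a VFL-style split model. It is an illustration under stated assumptions, not the framework the paper proposes. The two clients, the column names, the linear sub-models, and the logistic top model are all hypothetical choices made for the example.

```python
import math
import random

def make_client(n_features, embed_dim, rng):
    """Hypothetical client sub-model: a linear map over the client's own features."""
    return [[rng.uniform(-1, 1) for _ in range(n_features)] for _ in range(embed_dim)]

def client_forward(weights, local_features):
    """Compute an embedding locally; the raw features are not sent anywhere."""
    return [sum(w * x for w, x in zip(row, local_features)) for row in weights]

def server_forward(top_weights, embeddings):
    """Hypothetical top model: concatenate client embeddings, apply a logistic output."""
    joined = [v for emb in embeddings for v in emb]
    z = sum(w * v for w, v in zip(top_weights, joined))
    return 1.0 / (1.0 + math.exp(-z))

rng = random.Random(0)

# One sample, shared across clients by ID, with its attributes split by column.
# All column names and values are made up for illustration.
sample = {"age": 0.4, "income": 0.7, "clicks": 0.1, "visits": 0.9, "basket": 0.3}
client_a_cols = ["age", "income"]                # attributes held by client A
client_b_cols = ["clicks", "visits", "basket"]   # attributes held by client B

w_a = make_client(len(client_a_cols), 2, rng)
w_b = make_client(len(client_b_cols), 2, rng)
w_top = [rng.uniform(-1, 1) for _ in range(4)]   # 2 + 2 embedding dimensions

emb_a = client_forward(w_a, [sample[c] for c in client_a_cols])
emb_b = client_forward(w_b, [sample[c] for c in client_b_cols])
prediction = server_forward(w_top, [emb_a, emb_b])
print(f"prediction = {prediction:.3f}")
```

Each client sees only its own columns for the same sample, whereas in conventional (horizontal) FL each client would hold full rows for different samples. The sketch omits training, sample alignment, and any protection of the exchanged embeddings, which is where the four challenges listed in the abstract arise.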

