Towards Active Participant Centric Vertical Federated Learning: Some Representations May Be All You Need
This work addresses inefficiencies in VFL for scenarios with partially aligned data, offering a more practical solution for distributed machine learning applications.
The paper tackles the high communication costs and operational complexity of Vertical Federated Learning (VFL) with unaligned data partitions by introducing Active Participant Centric VFL (APC-VFL), which combines local representation learning with knowledge distillation. As the share of aligned data decreases, APC-VFL outperforms the compared methods in F1, accuracy, and communication efficiency.
Existing Vertical Federated Learning (VFL) methods often struggle with realistic, unaligned data partitions and incur high communication costs and significant operational complexity. This work introduces a novel approach to VFL, Active Participant Centric VFL (APC-VFL), that excels in scenarios where data samples among participants are only partially aligned at training time. Among its strengths, APC-VFL requires only a single communication step with the active participant. This is made possible by a local, unsupervised representation learning stage at each participant, followed by a knowledge distillation step at the active participant. Compared with other VFL methods such as SplitNN and VFedTrans, APC-VFL consistently outperforms them across three popular VFL datasets in F1, accuracy, and communication cost as the ratio of aligned data is reduced.
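To make the two-stage idea concrete, the following is a minimal sketch of the data flow described above, not the paper's implementation. It rests on several assumptions: PCA stands in for the unsupervised representation learner, the "teacher" is simply the concatenation of both participants' representations on aligned samples, and a ridge regression stands in for knowledge distillation. All names (`pca_encoder`, `fit_student`, the toy shapes) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def pca_encoder(X, k):
    """Fit a linear encoder via PCA (assumed stand-in for an unsupervised learner)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    components = vt[:k].T  # shape: (n_features, k)
    return lambda Z: (Z - mean) @ components

def fit_student(X, T, lam=1e-2):
    """Ridge regression from raw features X to teacher targets T
    (assumed stand-in for the knowledge distillation step)."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    W = np.linalg.solve(Xb.T @ Xb + lam * np.eye(Xb.shape[1]), Xb.T @ T)
    return lambda Z: np.hstack([Z, np.ones((Z.shape[0], 1))]) @ W

# Toy vertical partition: the active participant holds 200 samples;
# only the first 50 are also held (with other features) by a passive participant.
n_active, n_aligned = 200, 50
X_active = rng.normal(size=(n_active, 6))
X_passive = rng.normal(size=(n_aligned, 4))

# Stage 1: each participant learns a representation locally, without labels.
enc_active = pca_encoder(X_active, k=3)
enc_passive = pca_encoder(X_passive, k=2)

# Single communication step: the passive participant sends the
# representations of the aligned samples to the active participant.
H_passive = enc_passive(X_passive)
message_floats = H_passive.size  # number of values transmitted

# Stage 2 (active participant only): build teacher targets on aligned samples,
# then distill them into a student that needs only the active features.
teacher = np.hstack([enc_active(X_active[:n_aligned]), H_passive])
student = fit_student(X_active[:n_aligned], teacher)

# The student can now embed every active sample, aligned or not.
H_all = student(X_active)
print(H_all.shape, message_floats)
```

The point of the sketch is the shape of the protocol: one message from the passive participant, after which all remaining training, and inference on unaligned samples, uses only data available to the active participant.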