LG, ML | Jan 11, 2025

Reliable Imputed-Sample Assisted Vertical Federated Learning

Yaopei Zeng, Lei Liu, Shaoguo Liu, Hongjian Dou, Baoyuan Wu, Li Liu

arXiv:2501.06429v1 | h-index: 11 | ICASSP

Originality: Incremental advance

AI Analysis

This work addresses the data scarcity issue in VFL for parties with non-overlapping data, offering an incremental improvement over existing methods.

The paper tackles the problem of limited overlapping samples in Vertical Federated Learning (VFL) by proposing a framework that selects reliable imputed samples from non-overlapping data, resulting in a 48% accuracy gain on CIFAR-10 with only 1% overlapping samples.

Vertical Federated Learning (VFL) is a well-known FL variant that enables multiple parties to collaboratively train a model without sharing their raw data. Existing VFL approaches focus on the samples that overlap among parties, so their performance is constrained by the limited number of such samples, and numerous non-overlapping samples remain unexplored. Some previous work has explored techniques for imputing missing values in samples, but often without adequate attention to the quality of the imputed samples. To address this issue, we propose a Reliable Imputed-Sample Assisted (RISA) VFL framework that effectively exploits non-overlapping samples by selecting reliable imputed samples for training VFL models. Specifically, after imputing non-overlapping samples, we introduce evidence theory to estimate the uncertainty of the imputed samples, and only samples with low uncertainty are selected. In this way, high-quality non-overlapping samples are used to improve the VFL model. Experiments on two widely used datasets demonstrate the significant performance gains achieved by RISA, especially when overlapping samples are limited, e.g., a 48% accuracy gain on CIFAR-10 with only 1% overlapping samples.
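
The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so this sketch assumes the Dirichlet-based uncertainty commonly used in evidential deep learning: non-negative evidence per class, alpha = evidence + 1, and uncertainty u = K / sum(alpha) for K classes. The function names, the softplus evidence mapping, and the threshold value of 0.5 are all hypothetical and are not taken from the paper.

```python
import math

def softplus(x):
    # Numerically stable softplus: log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def dirichlet_uncertainty(logits):
    # Assumed formulation (not confirmed by the abstract):
    # evidence_k = softplus(logit_k), alpha_k = evidence_k + 1,
    # uncertainty u = K / sum_k alpha_k, which lies in (0, 1].
    alphas = [softplus(z) + 1.0 for z in logits]
    return len(logits) / sum(alphas)

def select_reliable(batch_logits, threshold=0.5):
    # Keep the indices of imputed samples whose uncertainty is at or
    # below the (hypothetical) threshold; return uncertainties as well.
    uncertainties = [dirichlet_uncertainty(row) for row in batch_logits]
    kept = [i for i, u in enumerate(uncertainties) if u <= threshold]
    return kept, uncertainties

# Model outputs for two imputed samples over three classes:
# the first is confidently one class, the second carries little evidence.
batch = [[10.0, -10.0, -10.0], [0.0, 0.0, 0.0]]
kept, uncertainties = select_reliable(batch, threshold=0.5)
```

Under these assumptions, a sample with strong evidence for one class has a large total Dirichlet strength and therefore low uncertainty, so it is kept, while a sample with little evidence for any class is discarded before training.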
