Contrastive Federated Learning with Tabular Data Silos
This work addresses privacy-preserving machine learning for organizations with segmented tabular data, though the contribution appears incremental because it builds on existing federated and contrastive learning techniques.
The paper tackles the problem of learning from vertically partitioned data silos with sample misalignment and privacy concerns by proposing Contrastive Federated Learning (CFL), which outperforms existing methods without sharing data.
Learning from vertically partitioned data silos is challenging due to the segmented nature of the data, sample misalignment, and strict privacy concerns. Federated learning has been proposed as a solution. However, sample misalignment across silos often hinders optimal model performance and pushes methods toward sharing data within the model, which breaks privacy. We propose Contrastive Federated Learning with Tabular Data Silos (CFL), which handles data silos with sample misalignment without sharing original or representative data, thereby maintaining privacy. CFL begins with local acquisition of contrastive representations of the data within each silo and then aggregates knowledge from other silos through the federated learning algorithm. Our experiments demonstrate that CFL addresses the limitations of existing algorithms for data silos and outperforms existing tabular contrastive learning. CFL provides performance improvements without loosening privacy.
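The two stages named in the abstract, local contrastive representation learning followed by federated aggregation, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes an InfoNCE-style contrastive loss and FedAvg-style weighted averaging, neither of which the abstract specifies, and it assumes every silo uses encoder parameters of the same shape. The names `info_nce_loss` and `federated_average` are hypothetical.

```python
import numpy as np


def info_nce_loss(z_a, z_b, temperature=0.5):
    """InfoNCE-style loss (an assumed choice of contrastive objective).

    Row i of z_a and row i of z_b are two views of the same local sample
    (a positive pair); every other row in the batch acts as a negative.
    """
    # L2-normalise so that dot products are cosine similarities.
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature
    # Subtract the row maximum for a numerically stable log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positive pairs sit on the diagonal.
    return float(-np.mean(np.diag(log_prob)))


def federated_average(silo_weights, silo_sizes):
    """FedAvg-style aggregation (assumed): size-weighted mean of parameters.

    Only parameter arrays leave a silo; raw rows and embeddings stay local.
    """
    sizes = np.asarray(silo_sizes, dtype=float)
    coeffs = sizes / sizes.sum()
    return sum(c * w for c, w in zip(coeffs, silo_weights))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical setup: two silos, each with a linear encoder of shape (5, 3).
    encoders = [rng.normal(size=(5, 3)) for _ in range(2)]
    sizes = [100, 300]
    for silo_id, w in enumerate(encoders):
        x = rng.normal(size=(8, 5))                       # local batch, never shared
        view_a = x + 0.1 * rng.normal(size=x.shape)       # noisy view 1
        view_b = x + 0.1 * rng.normal(size=x.shape)       # noisy view 2
        print(silo_id, info_nce_loss(view_a @ w, view_b @ w))
    global_w = federated_average(encoders, sizes)
    print(global_w.shape)
```

In this sketch the contrastive loss is computed only on samples inside one silo, so no cross-silo sample alignment is needed, and the server sees only parameter arrays. How CFL itself constructs views and which parameters it aggregates are details the abstract does not give.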