Global Layers: Non-IID Tabular Federated Learning
This paper addresses the challenge of non-IID data in federated learning for tabular datasets and offers a novel solution for scenarios with mixed input/output spaces, though it appears incremental in its application to specific benchmarks.
The paper tackles data heterogeneity in federated learning for tabular data by proposing Global Layers (GL), a partial model personalization method that supports client-exclusive features and classes. In benchmark experiments, GL achieves better outcomes than Federated Averaging and local-only training, and some clients outperform their centralized baselines.
Data heterogeneity between clients remains a key challenge in Federated Learning (FL), particularly in the case of tabular data. This work presents Global Layers (GL), a novel partial model personalization method that is robust to shift in the joint distribution $P(X,Y)$ and to mixed input/output spaces $X \times Y$ across clients. To the best of our knowledge, GL is the first method capable of supporting both client-exclusive features and client-exclusive classes. We introduce two new benchmark experiments for tabular FL, naturally partitioned from existing real-world datasets: i) UCI Covertype, split into 4 clients by the "wilderness area" feature, and ii) UCI Heart Disease, SAHeart, and UCI Heart Failure, each acting as one client. Empirical results on these experiments in the full-participant setting show that GL achieves better outcomes than Federated Averaging (FedAvg) and local-only training, with some clients even performing better than their centralized baseline.
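The abstract does not spell out the GL architecture, so the following is only a minimal sketch of partial model personalization in general, under stated assumptions: each client keeps its own input and output layers, sized to its local features and classes, while a single hidden block is shared and averaged FedAvg-style with weights proportional to client sample counts. The function names (`init_client`, `average_shared`, `forward`), the layer layout, and the hidden width are all hypothetical and are not taken from the paper.

```python
import numpy as np

def init_client(n_features, n_classes, hidden=8, seed=0):
    """Create one client's parameters (assumed three-layer layout)."""
    rng = np.random.default_rng(seed)
    return {
        # Personal layers: shapes depend on the client's own feature/class sets.
        "input": rng.normal(size=(n_features, hidden)),
        "output": rng.normal(size=(hidden, n_classes)),
        # Shared block: same shape on every client, so it can be averaged.
        "shared": rng.normal(size=(hidden, hidden)),
    }

def average_shared(clients, n_samples):
    """Average only the shared block, weighted by client sample counts."""
    weights = np.asarray(n_samples, dtype=float)
    weights /= weights.sum()
    shared = sum(w * c["shared"] for w, c in zip(weights, clients))
    for c in clients:
        c["shared"] = shared.copy()  # personal layers are left untouched
    return shared

def forward(params, x):
    """Forward pass: personal input -> shared block -> personal output."""
    h = np.tanh(x @ params["input"])
    h = np.tanh(h @ params["shared"])
    return h @ params["output"]  # logits over this client's own classes

# Three hypothetical clients with different feature and class counts.
clients = [init_client(5, 2, seed=1), init_client(9, 3, seed=2), init_client(4, 2, seed=3)]
average_shared(clients, n_samples=[100, 300, 600])
```

Because only the shared block has to agree in shape, clients with different feature or class sets can still take part in the same aggregation round, which is the property the abstract highlights. Local training between rounds is omitted here.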