A Hybrid Federated Kernel Regularized Least Squares Algorithm
The paper addresses privacy-preserving model building in critical scenarios such as healthcare, but the contribution is incremental: it adapts an existing method, Kernel Regularized Least Squares, to a hybrid federated setting.
The paper tackles the problem of federated learning in hybrid settings where data is distributed across both samples and features, such as clinical and omics data across hospitals and labs. It presents an efficient reformulation of the Kernel Regularized Least Squares algorithm with two variants, validated on established datasets.
Federated learning is becoming an increasingly viable and accepted strategy for building machine learning models in critical privacy-preserving scenarios such as clinical settings. Often, the data involved is not limited to clinical data but also includes additional omics features (e.g. proteomics). Consequently, data is distributed not only across hospitals but also across omics centers, which are labs capable of generating such additional features from biosamples. This scenario leads to a hybrid setting where data is scattered both in terms of samples and features. In this hybrid setting, we present an efficient reformulation of the Kernel Regularized Least Squares algorithm, introduce two variants and validate them using well-established datasets. Lastly, we discuss security measures to defend against possible attacks.
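To make the setting concrete, the following is a minimal sketch of standard Kernel Regularized Least Squares under a feature-wise split. It is not the paper's reformulation or either of its variants, which the abstract does not describe. It assumes a linear kernel and the convention alpha = (K + lam * I)^{-1} y, and it only illustrates one well-known fact: with a linear kernel, the Gram matrix is the sum of the Gram matrices computed on disjoint feature blocks. The toy data, the party names (`X_clin`, `X_omics`) and all helper functions are hypothetical; the sample-wise split across hospitals and any privacy protection are not shown.

```python
# Illustrative sketch only: plain KRLS with a linear kernel (standard library only).
# This is NOT the paper's algorithm; data and names are made up.

def linear_gram(X):
    """Gram matrix K[i][j] = <x_i, x_j> over the rows of X."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]

def add_matrices(A, B):
    """Element-wise sum of two equally sized matrices."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def krls_fit(K, y, lam):
    """Dual coefficients alpha = (K + lam * I)^{-1} y (assumed convention)."""
    n = len(K)
    A = [[K[i][j] + (lam if i == j else 0.0) for j in range(n)] for i in range(n)]
    return solve(A, y)

# Toy data: 4 samples, 3 features (hypothetical values).
X = [[1.0, 2.0, 0.5],
     [0.0, 1.0, 1.5],
     [2.0, 0.0, 1.0],
     [1.0, 1.0, 1.0]]
y = [1.0, 0.0, 1.0, 0.5]

# Feature-wise split: two "clinical" columns at one party, one "omics" column at another.
X_clin = [row[:2] for row in X]
X_omics = [row[2:] for row in X]

# Each party computes a Gram matrix on its own features; the sum equals the
# Gram matrix of the full feature set because the kernel is linear.
K_central = linear_gram(X)
K_fed = add_matrices(linear_gram(X_clin), linear_gram(X_omics))

alpha_central = krls_fit(K_central, y, lam=0.1)
alpha_fed = krls_fit(K_fed, y, lam=0.1)
```

In this sketch, the centrally trained and the block-wise models have the same dual coefficients, since both solve the same linear system. The additivity does not carry over directly to nonlinear kernels such as the Gaussian kernel, and exchanging raw Gram matrices is not by itself a privacy guarantee.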