LG ML

Aug 14, 2020

Federated Doubly Stochastic Kernel Learning for Vertically Partitioned Data

Bin Gu, Zhiyuan Dang, Xiang Li, Heng Huang

arXiv:2008.06197v1

16.872 citations

Originality: Incremental advance

AI Analysis

This paper addresses privacy-preserving nonlinear learning for multi-provider data, but the advance is incremental because it builds on existing federated and kernel methods.

The paper tackles the challenge of training on vertically partitioned data while preserving privacy. It proposes a federated doubly stochastic kernel learning algorithm that trains significantly faster than state-of-the-art methods while maintaining similar generalization performance.

In many real-world data mining and machine learning applications, data are supplied by multiple providers, each of which maintains private records of different feature sets about common entities. Training traditional data mining and machine learning algorithms on such vertically partitioned data effectively and efficiently, while preserving data privacy, is challenging. In this paper, we focus on nonlinear learning with kernels and propose a federated doubly stochastic kernel learning (FDSKL) algorithm for vertically partitioned data. Specifically, we use random features to approximate the kernel mapping function and doubly stochastic gradients to update the solutions, all of which are computed in a federated manner without disclosing the data. Importantly, we prove that FDSKL has a sublinear convergence rate and guarantees data security under the semi-honest assumption. Extensive experimental results on a variety of benchmark datasets show that FDSKL is significantly faster than state-of-the-art federated learning methods when dealing with kernels, while retaining similar generalization performance.
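The two ingredients named in the abstract, random features and doubly stochastic gradients, can be illustrated with a minimal sketch. This is not the FDSKL protocol itself: the function names (`party_share`, `random_feature`, `fit`), the RBF kernel, the squared loss, and the step-size schedule are all assumptions made for illustration, and the party shares are summed in the clear, whereas the paper computes them without disclosing data. The sketch only shows why vertical partitioning fits random features: the inner product w^T x decomposes into per-party terms, and each random direction can be regenerated from a shared seed instead of being stored.

```python
import numpy as np

GAMMA = 0.5  # assumed RBF kernel: k(x, z) = exp(-GAMMA * ||x - z||^2)

def party_share(x_block, seed, party):
    """Local share w_l^T x_l; the block w_l is regenerated from (seed, party)."""
    rng = np.random.default_rng([seed, party])
    w_block = rng.normal(scale=np.sqrt(2 * GAMMA), size=x_block.shape[-1])
    return x_block @ w_block

def random_feature(x_blocks, seed):
    """phi_w(x) = sqrt(2) * cos(w^T x + b), with w^T x summed over parties.

    The shares are added in the clear here; a private protocol would
    replace this sum with a secure aggregation step.
    """
    b = np.random.default_rng([seed, len(x_blocks)]).uniform(0, 2 * np.pi)
    wx = sum(party_share(xb, seed, l) for l, xb in enumerate(x_blocks))
    return np.sqrt(2) * np.cos(wx + b)

def predict(x_blocks, alphas):
    """f(x) = sum_i alpha_i * phi_{w_i}(x); w_i is rebuilt from seed i."""
    return sum(a * random_feature(x_blocks, i) for i, a in enumerate(alphas))

def fit(blocks, y, n_iter=100, lam=1e-3, eta0=1.0, seed=0):
    """Doubly stochastic updates: one random sample and one random feature per step."""
    rng = np.random.default_rng(seed)
    alphas = np.zeros(n_iter)
    for t in range(n_iter):
        i = rng.integers(len(y))            # random training example
        x_t = [blk[i] for blk in blocks]    # its features, one block per party
        eta = eta0 / (1 + t)                # assumed decaying step size
        residual = predict(x_t, alphas[:t]) - y[i]  # squared-loss derivative
        alphas[:t] *= 1 - eta * lam         # shrink old coefficients (regularizer)
        alphas[t] = -eta * residual * random_feature(x_t, t)  # new random feature
    return alphas
```

In this sketch `blocks` is a list with one array per provider, each of shape `(n_samples, n_local_features)`, and the learned model is just the coefficient vector `alphas`, since every random direction is reproducible from its index.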
