LG, DC, ML · Jun 7, 2019

Towards Sharp Analysis for Distributed Learning with Random Features

arXiv:1906.03155v5 · 4 citations
Originality: Incremental advance
AI Analysis

This work addresses computational efficiency in distributed machine learning for researchers and practitioners, but it is incremental as it builds on existing methods.

The paper tackles the problem of distributed learning with random features in non-attainable cases, extending optimal rates and reducing computational costs via data-dependent strategies and unlabeled data, with experiments validating theoretical findings.

Recent studies of the generalization properties of distributed learning and random features assumed that the target concept exists in the hypothesis space. However, this strict condition does not hold in the more common non-attainable case. In this paper, using refined proof techniques, we first extend the optimal rates for distributed learning with random features to the non-attainable case. Then, we reduce the number of required random features via a data-dependent generating strategy, and increase the allowed number of partitions with additional unlabeled data. Theoretical analysis shows that these techniques substantially reduce computational cost while preserving the optimal generalization accuracy under standard assumptions. Finally, we conduct several experiments on both simulated and real-world datasets, and the empirical results validate our theoretical findings.
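To make the setting concrete, the following is a minimal sketch of divide-and-conquer ridge regression with random Fourier features. It is not the authors' implementation: the function names, the Gaussian kernel, the `sigma` and `lam` parameters, and the plain data-independent feature sampling are all assumptions chosen for illustration. The data-dependent feature generation and the use of unlabeled data described in the abstract are not included.

```python
import numpy as np

def rff_map(X, W, b):
    """Random Fourier features approximating a Gaussian kernel (assumed kernel)."""
    M = W.shape[1]
    return np.sqrt(2.0 / M) * np.cos(X @ W + b)

def fit_local(Z, y, lam):
    """Ridge regression on one partition, solved in the random-feature space."""
    n, M = Z.shape
    return np.linalg.solve(Z.T @ Z + n * lam * np.eye(M), Z.T @ y)

def fit_distributed(X, y, num_partitions, num_features, lam, sigma, seed=0):
    """Split the data, fit each partition locally, and average the weights."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # The same W and b are shared by all partitions so that the local
    # weight vectors live in the same feature space and can be averaged.
    W = rng.normal(scale=1.0 / sigma, size=(d, num_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=num_features)
    parts = np.array_split(rng.permutation(X.shape[0]), num_partitions)
    weights = [fit_local(rff_map(X[p], W, b), y[p], lam) for p in parts]
    return W, b, np.mean(weights, axis=0)

def predict(X, W, b, w):
    """Evaluate the averaged estimator on new inputs."""
    return rff_map(X, W, b) @ w

# Toy usage on synthetic data (hypothetical settings).
rng = np.random.default_rng(42)
X = rng.uniform(-3.0, 3.0, size=(600, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=600)
W, b, w = fit_distributed(X, y, num_partitions=4, num_features=100,
                          lam=1e-3, sigma=1.0)
mse = np.mean((predict(X, W, b, w) - np.sin(X[:, 0])) ** 2)
```

Each local solve costs time on the order of (n/m)·M² + M³ for n samples, m partitions, and M features, which is why the number of features and the number of partitions are the two quantities the paper's analysis focuses on.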

Code Implementations: 1 repo