DC, LG · Jul 6, 2015

Revisiting Large Scale Distributed Machine Learning

arXiv:1507.01461v1 · 1 citation
Originality: Synthesis-oriented
AI Analysis

This is an incremental survey that addresses the problem of scaling machine learning for distributed data in domains like personal healthcare.

The paper provides a survey of distributed machine learning algorithms for handling large-scale, high-dimensional data, with a focus on personal healthcare applications, and proposes future directions emphasizing security and low communication overhead in client-server architectures.

Nowadays, with the widespread adoption of smartphones and other portable gadgets equipped with a variety of sensors, data is ubiquitously available, and the focus of machine learning has shifted from inferring from small training samples to dealing with large-scale, high-dimensional data. In domains such as personal healthcare applications, which motivate this survey, distributed machine learning is a promising line of research, both for scaling up learning algorithms and, above all, for dealing with data that is inherently produced at different locations. This report offers a thorough overview of state-of-the-art algorithms for distributed machine learning, for both supervised and unsupervised learning, ranging from simple linear logistic regression to graphical models and clustering. We propose future directions for most categories, specific to potential personal healthcare applications. With this in mind, the report focuses on how security and low communication overhead can be assured in the specific case of a strictly client-server architectural model. As particular directions, we provide an exhaustive presentation of an empirical clustering algorithm, k-windows, and propose an asynchronous distributed machine learning algorithm that would scale well while being computationally cheap and easy to implement.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
