DB · LG · Aug 4, 2015

Parameter Database : Data-centric Synchronization for Scalable Machine Learning

arXiv:1508.00703v1
Originality: Incremental advance
AI Analysis

This paper addresses synchronization bottlenecks in scalable machine learning, though the advance appears incremental because it builds on existing synchronization methods.

The paper tackles inefficient synchronization in distributed machine learning by proposing a data-centric framework that relaxes the bulk synchronous parallel (BSP) paradigm. Its experiments suggest substantial improvement over BSP while guaranteeing sequential correctness.

We propose a new data-centric synchronization framework for carrying out machine learning (ML) tasks in a distributed environment. Our framework exploits the iterative nature of ML algorithms and relaxes the application-agnostic bulk synchronous parallel (BSP) paradigm that has previously been used for distributed machine learning. Data-centric synchronization complements function-centric synchronization, which relies on stale updates to increase the throughput of distributed ML computations. Experiments to validate our framework suggest that we can attain substantial improvement over BSP while guaranteeing sequential correctness of ML tasks.
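To make the contrast with BSP concrete, here is a minimal sketch of a toy scheduler. It is not the paper's data-centric framework, whose mechanics are not described in this summary. It only illustrates the general idea of relaxing the BSP barrier with a bounded-staleness rule, in the spirit of the stale-update approach the abstract mentions. The names `can_proceed`, `simulate`, `costs`, and `staleness` are hypothetical, and the per-iteration costs are made-up numbers.

```python
from typing import List


def can_proceed(worker_clock: int, all_clocks: List[int], staleness: int) -> bool:
    """Decide whether a worker may start its next iteration.

    A clock counts completed iterations. With staleness == 0 this is the BSP
    barrier: no worker may run ahead of the slowest one. With staleness > 0 a
    worker may be up to `staleness` iterations ahead (an assumed toy rule).
    """
    return worker_clock - min(all_clocks) <= staleness


def simulate(costs: List[List[int]], staleness: int) -> int:
    """Return the ticks needed until every worker finishes all iterations.

    costs[w][i] is the (positive, integer) duration of iteration i on worker w.
    """
    n = len(costs)
    num_iters = len(costs[0])
    clocks = [0] * n      # completed iterations per worker
    remaining = [0] * n   # ticks left in the running iteration; 0 means idle
    ticks = 0
    while min(clocks) < num_iters:
        # Idle workers start their next iteration if the rule allows it.
        for w in range(n):
            idle = remaining[w] == 0 and clocks[w] < num_iters
            if idle and can_proceed(clocks[w], clocks, staleness):
                remaining[w] = costs[w][clocks[w]]
        # Advance time by one tick and record finished iterations.
        ticks += 1
        for w in range(n):
            if remaining[w] > 0:
                remaining[w] -= 1
                if remaining[w] == 0:
                    clocks[w] += 1
    return ticks


# Two workers whose slow iterations alternate (hypothetical costs).
costs = [[3, 1, 3, 1],
         [1, 3, 1, 3]]
bsp_ticks = simulate(costs, staleness=0)
relaxed_ticks = simulate(costs, staleness=1)
print(bsp_ticks, relaxed_ticks)
```

In this toy setting the strict barrier makes every iteration wait for the slower worker, so the run takes 12 ticks, while allowing one iteration of slack lets the workers overlap their slow phases and finish in 8. The sketch says nothing about correctness guarantees; the abstract's claim of sequential correctness concerns the paper's own framework.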
