CYCLADES: Conflict-free Asynchronous Machine Learning
This work addresses efficiency bottlenecks in machine learning training for practitioners by providing a faster, conflict-free alternative to existing asynchronous methods, though it is incremental as it builds on HOGWILD!-type algorithms.
The paper tackles the problem of parallelizing stochastic optimization algorithms in shared memory settings by introducing CYCLADES, a conflict-free asynchronous framework that eliminates memory locking and conflicts, resulting in up to 40% speedup over HOGWILD! for SGD and up to 5x gains for variance reduction algorithms on sparse datasets.
We present CYCLADES, a general framework for parallelizing stochastic optimization algorithms in a shared memory setting. CYCLADES is asynchronous during shared model updates, and requires no memory locking mechanisms, similar to HOGWILD!-type algorithms. Unlike HOGWILD!, CYCLADES introduces no conflicts during the parallel execution, and offers a black-box analysis for provable speedups across a large family of algorithms. Due to its inherent conflict-free nature and cache locality, our multi-core implementation of CYCLADES consistently outperforms HOGWILD!-type algorithms on sufficiently sparse datasets, leading to up to 40% speedup gains compared to the HOGWILD! implementation of SGD, and up to 5x gains over asynchronous implementations of variance reduction algorithms.