Conditions for Convergence in Regularized Machine Learning Objectives
This provides a practical guide for machine learning practitioners dealing with convergence issues in distributed settings, but it is incremental as it focuses on summarizing existing knowledge rather than introducing new methods.
The paper tackles the problem of analyzing convergence rates for convex optimization algorithms in distributed computing, where empirical and theoretical analyses diverge due to non-linear slowdowns from communication operations, and it shows the existence of convergence and provides lower bounds for these rates.
Analysis of the convergence rates of modern convex optimization algorithms can be achived through binary means: analysis of emperical convergence, or analysis of theoretical convergence. These two pathways of capturing information diverge in efficacy when moving to the world of distributed computing, due to the introduction of non-intuitive, non-linear slowdowns associated with broadcasting, and in some cases, gathering operations. Despite these nuances in the rates of convergence, we can still show the existence of convergence, and lower bounds for the rates. This paper will serve as a helpful cheat-sheet for machine learning practitioners encountering this problem class in the field.