Un-regularizing: approximate proximal point and faster stochastic algorithms for empirical risk minimization
This work addresses faster optimization for machine learning practitioners, but it is incremental as it builds on existing proximal point and stochastic methods.
The paper tackled the problem of accelerating empirical risk minimization (ERM) by developing a family of accelerated stochastic algorithms that improve running times for ERM, including linear least-squares regression, across various settings. The result is faster algorithms that avoid bias from regularization, with empirical demonstrations of stability.
We develop a family of accelerated stochastic algorithms that minimize sums of convex functions. Our algorithms improve upon the fastest running time for empirical risk minimization (ERM), and in particular linear least-squares regression, across a wide range of problem settings. To achieve this, we establish a framework based on the classical proximal point algorithm. Namely, we provide several algorithms that reduce the minimization of a strongly convex function to approximate minimizations of regularizations of the function. Using these results, we accelerate recent fast stochastic algorithms in a black-box fashion. Empirically, we demonstrate that the resulting algorithms exhibit notions of stability that are advantageous in practice. Both in theory and in practice, the provided algorithms reap the computational benefits of adding a large strongly convex regularization term, without incurring a corresponding bias to the original problem.