A Unified Approach to Adaptive Regularization in Online and Stochastic Optimization
This work gives optimization researchers a theoretical unification, but the contribution is incremental: it primarily refines existing analyses rather than introducing new methods.
The authors analyze adaptive regularization in online and stochastic optimization through a unified framework that simplifies the convergence proofs for existing algorithms such as AdaGrad and Online Newton Step and clarifies the rationale behind their preconditioned updates.
We describe a framework for deriving and analyzing online optimization algorithms that incorporate adaptive, data-dependent regularization, also termed preconditioning. Such algorithms have proven useful in stochastic optimization because they reshape the gradients according to the geometry of the data. Our framework captures and unifies much of the existing literature on adaptive online methods, including the AdaGrad and Online Newton Step algorithms as well as their diagonal versions. As a result, we obtain new convergence proofs for these algorithms that are substantially simpler than previous analyses. Our framework also exposes the rationale for the different preconditioned updates used in common stochastic optimization methods.
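To make "reshaping the gradients" concrete, here is a minimal sketch of a diagonal AdaGrad-style update in its commonly stated form: each coordinate of the gradient is divided by the square root of that coordinate's accumulated squared gradients. This is an illustration under stated assumptions, not the framework or derivation of this work. The function names `adagrad_diagonal` and `quad_grad`, the step size `lr`, the stabilizer `eps`, and the test objective are all hypothetical choices made for the example.

```python
import math

def adagrad_diagonal(grad_fn, x0, lr=0.5, eps=1e-8, steps=200):
    """Diagonal AdaGrad-style update (illustrative sketch, not the paper's code)."""
    x = list(x0)
    # Running sum of squared gradients, one entry per coordinate.
    g2_sum = [0.0] * len(x)
    for _ in range(steps):
        g = grad_fn(x)
        for i, gi in enumerate(g):
            g2_sum[i] += gi * gi
            # Diagonal preconditioning: scale each coordinate's step by the
            # inverse square root of its accumulated squared gradients.
            x[i] -= lr * gi / (math.sqrt(g2_sum[i]) + eps)
    return x

# Hypothetical badly scaled quadratic: f(x) = 0.5 * (100 * x[0]**2 + x[1]**2).
def quad_grad(x):
    return [100.0 * x[0], 1.0 * x[1]]

x_final = adagrad_diagonal(quad_grad, [1.0, 1.0])
```

On this toy objective the two coordinates have curvatures that differ by a factor of 100, yet the per-coordinate scaling cancels that factor, so both coordinates follow nearly the same trajectory toward zero. Plain gradient descent with a single step size would have to be tuned to the steeper coordinate and would move slowly along the flatter one.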