LG AI OC ML Jun 1, 2019

Adaptive Online Learning for Gradient-Based Optimizers

arXiv:1906.00290v1
Originality: Incremental advance
AI Analysis

This work addresses algorithm selection in online convex optimization, a practical problem for practitioners. The advance is incremental, since it builds upon existing adaptive methods.

The paper tackles the challenge of selecting an appropriate online convex optimization algorithm. It proposes a framework that adaptively competes with the best expert algorithm in a family, generalizing methods such as MetaGrad and Ader. It also reports empirical results on learning the best regularizer for multiclass learning on the simplex and $l_2$-ball domains.
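As a rough illustration of what "competing with the best expert in a family" can look like, the sketch below runs several projected gradient descent experts with different step sizes and combines them with exponential weights on linearized losses. It is a generic textbook-style construction, not the paper's algorithm: the names `OGDExpert` and `run_meta`, the quadratic loss, the unit $l_2$-ball domain, and the `meta_lr` value are all assumptions made for illustration.

```python
import math


def project_l2_ball(x, radius=1.0):
    """Euclidean projection onto the l2 ball of the given radius."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm <= radius:
        return list(x)
    return [v * radius / norm for v in x]


class OGDExpert:
    """Hypothetical expert: projected online gradient descent, fixed step size."""

    def __init__(self, dim, step_size):
        self.x = [0.0] * dim
        self.step_size = step_size

    def update(self, grad):
        moved = [xi - self.step_size * gi for xi, gi in zip(self.x, grad)]
        self.x = project_l2_ball(moved)


def run_meta(step_sizes, targets, meta_lr=0.5):
    """Play a weighted average of expert points; reweight by linearized loss.

    Assumed loss (illustrative only): f_t(x) = 0.5 * ||x - z_t||^2 on the
    unit l2 ball. Returns (cumulative loss of the combined play, the expert
    weights used in the last round).
    """
    dim = len(targets[0])
    experts = [OGDExpert(dim, eta) for eta in step_sizes]
    log_w = [0.0] * len(experts)
    weights = [1.0 / len(experts)] * len(experts)
    total_loss = 0.0
    for z in targets:
        # Normalize the exponential weights (shifted by the max for stability).
        top = max(log_w)
        raw = [math.exp(lw - top) for lw in log_w]
        norm = sum(raw)
        weights = [r / norm for r in raw]
        # Combined play: convex combination of the experts' current points.
        x = [sum(w * e.x[j] for w, e in zip(weights, experts)) for j in range(dim)]
        total_loss += 0.5 * sum((xj - zj) ** 2 for xj, zj in zip(x, z))
        grad = [xj - zj for xj, zj in zip(x, z)]  # gradient at the combined play
        for i, e in enumerate(experts):
            # Penalize each expert by its linearized loss <grad, x_i>.
            log_w[i] -= meta_lr * sum(g * xi for g, xi in zip(grad, e.x))
            # Every expert is fed the same gradient, so one query per round suffices.
            e.update(grad)
    return total_loss, weights


if __name__ == "__main__":
    loss, w = run_meta([0.01, 0.1, 1.0], [[0.6, 0.0]] * 200)
    print("cumulative loss:", loss)
    print("last-round weights:", w)
```

Because the domain is convex, the weighted average of the experts' points is itself feasible, so the combined play needs no extra projection. Feeding every expert the gradient taken at the combined play keeps the cost at one gradient evaluation per round, at the price of experts optimizing a surrogate rather than their own loss.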

As application demand for online convex optimization accelerates, so does the need for new methods that simultaneously cover a large class of convex functions and incur the lowest possible regret. Known online optimization methods usually perform well only in specific settings, and their performance depends heavily on the geometry of the decision space and cost functions. In practice, however, such geometric information is often unavailable, which makes it unclear which algorithm to use. To address this issue, some adaptive methods have been proposed that focus on adaptively learning parameters such as step size, Lipschitz constant, and strong convexity coefficient, or on specific parametric families such as quadratic regularizers. In this work, we generalize these methods and propose a framework that competes with the best algorithm in a family of expert algorithms. Our framework includes many of the well-known adaptive methods, including MetaGrad, MetaGrad+C, and Ader. We also introduce a second algorithm that is computationally more efficient than our first algorithm, with at most a constant factor increase in regret. Finally, as a representative application of our proposed algorithm, we study the problem of learning the best regularizer from a family of regularizers for Online Mirror Descent. Empirically, we support our theoretical findings on the problem of learning the best regularizer on the simplex and $l_2$-ball in a multiclass learning problem.
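To make the role of the regularizer in Online Mirror Descent concrete, the sketch below shows one update step under two standard regularizers: negative entropy on the simplex (which yields the exponentiated gradient update) and the squared Euclidean norm on the $l_2$-ball (which yields projected gradient descent). The function names and the fixed `step_size` argument are illustrative assumptions, and the abstract does not say which regularizer family the paper actually uses.

```python
import math


def omd_step_entropic(x, grad, step_size):
    """One OMD step on the probability simplex with the negative-entropy
    regularizer, i.e. the exponentiated gradient update."""
    # Subtract the smallest exponent so that exp() never overflows.
    shift = min(step_size * g for g in grad)
    unnorm = [xi * math.exp(-(step_size * gi - shift)) for xi, gi in zip(x, grad)]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def omd_step_euclidean(x, grad, step_size, radius=1.0):
    """One OMD step on the l2 ball with regularizer 0.5 * ||x||^2,
    i.e. a gradient step followed by Euclidean projection."""
    y = [xi - step_size * gi for xi, gi in zip(x, grad)]
    norm = math.sqrt(sum(v * v for v in y))
    if norm > radius:
        y = [v * radius / norm for v in y]
    return y


if __name__ == "__main__":
    print(omd_step_entropic([0.25, 0.25, 0.25, 0.25], [1.0, 0.0, 0.0, 0.0], 0.5))
    print(omd_step_euclidean([0.0, 0.0], [-3.0, -4.0], 1.0))
```

The two updates respond differently to the same gradient: the entropic step rescales coordinates multiplicatively and never leaves the simplex, while the Euclidean step moves additively and relies on projection. That difference in geometry is why the choice of regularizer matters and why one might want to learn it online.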

