LG · OC · Mar 2, 2022

Adaptive Gradient Methods with Local Guarantees

Princeton
arXiv:2203.01400v3 · 13 citations · h-index: 82
AI Analysis

This paper addresses the need for robust optimization methods in machine learning by automating learning rate schedules, though it appears incremental because it builds on existing adaptive gradient techniques.

The paper tackles the problem of learning local preconditioners that adapt during optimization, proposing an adaptive gradient method with provable adaptive regret guarantees against the best local preconditioner. Without manual learning rate tuning, the method achieves stable accuracy on vision and language tasks that is comparable to that of fine-tuned optimizers.

Adaptive gradient methods are the method of choice for optimization in machine learning and are used to train the largest deep models. In this paper we study the problem of learning a local preconditioner that can change as the data changes along the optimization trajectory. We propose an adaptive gradient method that has provable adaptive regret guarantees against the best local preconditioner. To derive this guarantee, we prove a new adaptive regret bound in online learning that improves upon previous adaptive online learning methods. We demonstrate the robustness of our method in automatically choosing the optimal learning rate schedule for popular benchmarking tasks in the vision and language domains. Without the need to manually tune a learning rate schedule, our method can, in a single run, achieve stable task accuracy comparable to that of a fine-tuned optimizer.
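For readers unfamiliar with the term, a preconditioner in this setting rescales each gradient step using statistics of past gradients. The sketch below is standard diagonal AdaGrad on a toy quadratic, included only to illustrate the general idea of an adaptive preconditioner; it is not the method proposed in the paper, and the names `adagrad_minimize` and `quad_grad`, the step size, and the test function are all assumptions made for illustration.

```python
import math


def adagrad_minimize(grad_fn, x0, lr=0.5, eps=1e-8, steps=200):
    """Diagonal AdaGrad (illustrative baseline, not the paper's algorithm).

    Each coordinate is divided by the square root of its accumulated
    squared gradients, which acts as a diagonal preconditioner.
    """
    x = list(x0)
    accum = [0.0] * len(x)  # running sum of squared gradients per coordinate
    for _ in range(steps):
        g = grad_fn(x)
        for i, gi in enumerate(g):
            accum[i] += gi * gi
            x[i] -= lr * gi / (math.sqrt(accum[i]) + eps)
    return x


def quad_grad(x):
    # Gradient of the hypothetical test function
    # f(x) = 0.5 * (100 * x[0]**2 + x[1]**2), which is badly conditioned.
    return [100.0 * x[0], x[1]]


x_final = adagrad_minimize(quad_grad, [1.0, 1.0])
print(x_final)
```

Because the per-coordinate scaling cancels the factor of 100 in the first coordinate, both coordinates shrink at essentially the same rate here without any hand-tuned per-coordinate step size. The accumulated statistics in this baseline cover the whole trajectory, whereas the abstract above concerns a preconditioner that can change along the trajectory.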

