LG · Mar 12, 2022

Optimizer Amalgamation

arXiv:2203.06474v2 · 5 citations · h-index: 81 · Has Code
Originality: Incremental advance
AI Analysis

This paper addresses the difficulty researchers and practitioners face in choosing an optimizer, offering a novel but incremental approach to improving optimization methods.

The paper tackles the problem of selecting an appropriate optimizer by proposing Optimizer Amalgamation, which combines a pool of teacher optimizers into a single student optimizer. In the reported experiments, the student achieves stronger problem-specific performance than the teachers it was amalgamated from.

Selecting an appropriate optimizer for a given problem is of major interest for researchers and practitioners. Many analytical optimizers have been proposed using a variety of theoretical and empirical approaches; however, none can offer a universal advantage over other competitive optimizers. We are thus motivated to study a new problem named Optimizer Amalgamation: how can we best combine a pool of "teacher" optimizers into a single "student" optimizer that can have stronger problem-specific performance? In this paper, we draw inspiration from the field of "learning to optimize" to use a learnable amalgamation target. First, we define three differentiable amalgamation mechanisms to amalgamate a pool of analytical optimizers by gradient descent. Then, in order to reduce variance of the amalgamation process, we also explore methods to stabilize the amalgamation process by perturbing the amalgamation target. Finally, we present experiments showing the superiority of our amalgamated optimizer compared to its amalgamated components and learning to optimize baselines, and the efficacy of our variance reducing perturbations. Our code and pre-trained models are publicly available at http://github.com/VITA-Group/OptimizerAmalgamation.
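The core idea in the abstract, training a learnable student optimizer by gradient descent so that it absorbs a pool of analytical teachers, can be illustrated with a toy sketch. This is not the paper's implementation and not one of its three mechanisms as published: it assumes a simple imitation target (the average of the teachers' proposed steps), a linear student over three hand-picked features, and random quadratic training problems. All function names and hyperparameters below are hypothetical.

```python
import numpy as np


def teacher_steps(grad, state, lr=0.1, beta=0.9, eps=1e-8):
    """Update shared statistics and return each teacher's proposed step."""
    state["m"] = beta * state["m"] + grad                    # momentum buffer
    state["v"] = beta * state["v"] + (1 - beta) * grad ** 2  # second-moment estimate
    momentum_step = -lr * state["m"]                         # teacher 1: SGD with momentum
    rmsprop_step = -lr * grad / (np.sqrt(state["v"]) + eps)  # teacher 2: RMSProp-style
    return np.stack([momentum_step, rmsprop_step])


def student_features(grad, state, eps=1e-8):
    """Per-coordinate inputs the student combines linearly (shape: 3 x dim)."""
    return np.stack([grad, state["m"], grad / (np.sqrt(state["v"]) + eps)])


def amalgamate(num_problems=200, steps=20, dim=5, meta_lr=0.5, seed=0):
    """Fit student weights w so that w @ features imitates the mean teacher step."""
    rng = np.random.default_rng(seed)
    w = np.zeros(3)  # student parameters (assumed linear student)
    losses = []
    for _ in range(num_problems):
        # Random convex quadratic f(x) = 0.5 * sum(a * x**2), so grad = a * x.
        a = rng.uniform(0.5, 2.0, size=dim)
        x = rng.normal(size=dim)
        state = {"m": np.zeros(dim), "v": np.zeros(dim)}
        total = 0.0
        for _ in range(steps):
            grad = a * x
            target = teacher_steps(grad, state).mean(axis=0)  # assumed imitation target
            feats = student_features(grad, state)
            err = w @ feats - target
            total += float(np.mean(err ** 2))
            # Normalized gradient step on the squared imitation error.
            w -= meta_lr * (feats @ err) / (np.sum(feats ** 2) + 1e-12)
            x = x + target  # roll the trajectory forward along the teachers' mean step
        losses.append(total / steps)
    return w, losses


if __name__ == "__main__":
    w, losses = amalgamate()
    print("student weights:", w)
    print("imitation loss, first vs last problem:", losses[0], losses[-1])
```

The normalized update keeps the toy training loop stable regardless of feature scale. A practical amalgamation target would more likely be a small neural network trained through unrolled optimization, as in learning-to-optimize methods.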

Code Implementations: 1 repo