LG AI | Jan 31, 2023

Mind the (optimality) Gap: A Gap-Aware Learning Rate Scheduler for Adversarial Nets

MIT

arXiv:2302.00089v13.84 citations | h-index: 17

Originality: Incremental advance

AI Analysis

The paper addresses instability and heavy tuning requirements in training adversarial nets, a concern for researchers and practitioners in generative modeling and domain adaptation. It is an incremental advance built around a new learning rate scheduler.

The paper tackles the difficulty of training adversarial nets by introducing a learning rate scheduler that dynamically adapts the adversary's learning rate to keep the competing networks in balance. Reported results include a smaller tuning budget (one-tenth of the budget needed without the scheduler on CelebA) and improvements reaching up to 27% in Frechet Inception Distance for image generation and 3% in test accuracy for domain adaptation.

Adversarial nets have proved to be powerful in various domains including generative modeling (GANs), transfer learning, and fairness. However, successfully training adversarial nets using first-order methods remains a major challenge. Typically, careful choices of the learning rates are needed to maintain the delicate balance between the competing networks. In this paper, we design a novel learning rate scheduler that dynamically adapts the learning rate of the adversary to maintain the right balance. The scheduler is driven by the fact that the loss of an ideal adversarial net is a constant known a priori. The scheduler is thus designed to keep the loss of the optimized adversarial net close to that of an ideal network. We run large-scale experiments to study the effectiveness of the scheduler on two popular applications: GANs for image generation and adversarial nets for domain adaptation. Our experiments indicate that adversarial nets trained with the scheduler are less likely to diverge and require significantly less tuning. For example, on CelebA, a GAN with the scheduler requires only one-tenth of the tuning budget needed without a scheduler. Moreover, the scheduler leads to statistically significant improvements in model quality, reaching up to $27\%$ in Frechet Inception Distance for image generation and $3\%$ in test accuracy for domain adaptation.
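The core idea in the abstract, steering the adversary's learning rate by the gap between its current loss and the known loss of an ideal adversary, can be illustrated with a minimal sketch. The rule below is an assumption made for illustration and is not the paper's actual scheduling function: the name `gap_aware_lr`, the clamping bounds `min_scale` and `max_scale`, and the linear scaling are all hypothetical. It also assumes the standard GAN discriminator loss (summed binary cross-entropy on real and fake batches), for which an ideal discriminator outputs 0.5 everywhere and the loss equals 2 log 2.

```python
import math

# Assumption: standard GAN discriminator loss, i.e. binary cross-entropy on
# a real batch plus binary cross-entropy on a fake batch. An ideal
# discriminator outputs 0.5 everywhere, giving a loss of 2 * log(2).
IDEAL_LOSS = 2.0 * math.log(2.0)


def gap_aware_lr(base_lr, loss, ideal_loss=IDEAL_LOSS,
                 min_scale=0.1, max_scale=2.0):
    """Scale the adversary's learning rate by its optimality gap.

    Hypothetical rule (not taken from the paper):
      - loss below ideal  -> adversary is too strong -> shrink the rate
      - loss above ideal  -> adversary is too weak   -> grow the rate
      - loss equal to ideal -> keep the base rate
    The scale factor is clamped to [min_scale, max_scale].
    """
    scale = loss / ideal_loss
    scale = max(min_scale, min(max_scale, scale))
    return base_lr * scale


if __name__ == "__main__":
    base_lr = 1e-3
    for loss in (0.2, IDEAL_LOSS, 3.0):
        print(f"loss={loss:.3f} -> lr={gap_aware_lr(base_lr, loss):.6f}")
```

In a training loop, such a function would be called once per step with the adversary's most recent (possibly smoothed) loss, and the result written into the adversary's optimizer before its update. The other network's learning rate stays untouched, matching the abstract's description that only the adversary's rate is adapted.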
