LG | CV | ML | Jul 15, 2020

Gradient-based Hyperparameter Optimization Over Long Horizons

arXiv:2007.07869v2 | 22 citations | Has Code
AI Analysis

This work addresses a bottleneck in hyperparameter optimization for long-horizon tasks and offers a practical solution with broad applicability in machine learning, though it is incremental in that it builds on existing gradient-based methods.

The paper tackles gradient-based hyperparameter optimization for tasks with long horizons, which suffers from memory scaling and gradient degradation issues. It proposes forward-mode differentiation with sharing (FDS), which significantly outperforms greedy gradient-based alternatives and achieves 20x speedups over state-of-the-art black-box methods on CIFAR-10.

Gradient-based hyperparameter optimization has earned widespread popularity in the context of few-shot meta-learning, but remains broadly impractical for tasks with long horizons (many gradient steps), due to memory scaling and gradient degradation issues. A common workaround is to learn hyperparameters online, but this introduces greediness, which comes with a significant performance drop. We propose forward-mode differentiation with sharing (FDS), a simple and efficient algorithm which tackles memory scaling issues with forward-mode differentiation, and gradient degradation issues by sharing hyperparameters that are contiguous in time. We provide theoretical guarantees about the noise reduction properties of our algorithm, and demonstrate its efficiency empirically by differentiating through $\sim 10^4$ gradient steps of unrolled optimization. We consider large hyperparameter search ranges on CIFAR-10 where we significantly outperform greedy gradient-based alternatives, while achieving $\times 20$ speedups compared to the state-of-the-art black-box methods. Code is available at: https://github.com/polo5/FDS
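The two ideas in the abstract, forward-mode differentiation and sharing hyperparameters across contiguous steps, can be illustrated on a toy problem. The following is a minimal sketch, not the authors' implementation: it assumes a quadratic training loss (so the Hessian is simply the matrix `A`), plain gradient descent, and a learning rate as the only hyperparameter, with one value shared by each contiguous block of steps. All function and variable names are hypothetical. The tangent matrix `Z` holds the derivative of the weights with respect to each shared learning rate and is updated alongside the weights, so memory does not grow with the number of unrolled steps.

```python
import numpy as np

def train_grad(w, A, b):
    # Gradient of the assumed toy training loss 0.5 * w^T A w - b^T w (A symmetric).
    return A @ w - b

def val_loss(w, A_val, b_val):
    # Assumed toy validation loss with the same quadratic form.
    return 0.5 * w @ A_val @ w - b_val @ w

def unroll(lrs, w0, A, b, steps):
    # Plain unrolled gradient descent; lrs[k] is shared by the k-th contiguous block.
    block = steps // len(lrs)
    w = w0.copy()
    for t in range(steps):
        k = min(t // block, len(lrs) - 1)
        w = w - lrs[k] * train_grad(w, A, b)
    return w

def forward_hypergradient(lrs, w0, A, b, A_val, b_val, steps):
    # Forward-mode hypergradient of the final validation loss w.r.t. each shared lr.
    K = len(lrs)
    block = steps // K
    w = w0.copy()
    Z = np.zeros((w0.size, K))  # Z[:, j] = d w_t / d lrs[j]
    for t in range(steps):
        k = min(t // block, K - 1)
        g = train_grad(w, A, b)
        # Differentiate w_{t+1} = w_t - lr_k * g(w_t) with respect to every lrs[j]:
        # Z_{t+1} = Z_t - lr_k * H Z_t, minus g(w_t) in column k (H = A here).
        Z = Z - lrs[k] * (A @ Z)
        Z[:, k] -= g
        w = w - lrs[k] * g
    val_grad = A_val @ w - b_val  # gradient of val_loss at the final weights
    return Z.T @ val_grad, w

# Small demonstration on a random, well-conditioned quadratic.
rng = np.random.default_rng(0)
D, K, STEPS = 5, 4, 40
M = rng.normal(size=(D, D))
A = M @ M.T / D + np.eye(D)
b = rng.normal(size=D)
M_val = rng.normal(size=(D, D))
A_val = M_val @ M_val.T / D + np.eye(D)
b_val = rng.normal(size=D)
w0 = rng.normal(size=D)
lrs = np.array([0.05, 0.03, 0.02, 0.01])

hypergrad, w_final = forward_hypergradient(lrs, w0, A, b, A_val, b_val, STEPS)
print("hypergradient per shared learning rate:", hypergrad)
```

In this sketch the extra memory is one `D x K` matrix regardless of the horizon, whereas reverse-mode unrolling would need to store (or recompute) every intermediate weight vector. Using fewer, longer blocks (smaller `K`) is the toy analogue of sharing hyperparameters over time.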

Code Implementations: 1 repo
