OCLGFeb 12, 2020

Average-case Acceleration Through Spectral Density Estimation

arXiv:2002.04756v714 citations
AI Analysis

This work addresses optimization challenges in machine learning by providing average-case accelerated methods that do not require knowledge of the Hessian's smallest singular value, offering improvements over existing worst-case approaches in specific scenarios.

The paper tackles the problem of optimizing random quadratic problems by developing a framework for average-case analysis, which yields momentum-based algorithms that achieve acceleration based on the Hessian's eigenvalue distribution. The proposed methods outperform classical worst-case accelerated methods in certain regimes, as shown in empirical benchmarks on quadratic and logistic regression problems.

We develop a framework for the average-case analysis of random quadratic problems and derive algorithms that are optimal under this analysis. This yields a new class of methods that achieve acceleration given a model of the Hessian's eigenvalue distribution. We develop explicit algorithms for the uniform, Marchenko-Pastur, and exponential distributions. These methods are momentum-based algorithms, whose hyper-parameters can be estimated without knowledge of the Hessian's smallest singular value, in contrast with classical accelerated methods like Nesterov acceleration and Polyak momentum. Through empirical benchmarks on quadratic and logistic regression problems, we identify regimes in which the the proposed methods improve over classical (worst-case) accelerated methods.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes