LGJun 13, 2017

Accelerated Dual Learning by Homotopic Initialization

arXiv:1706.03958v1
Originality Incremental advance
AI Analysis

This work addresses optimization efficiency for machine learning practitioners, particularly in deep learning, but it is incremental as it builds on known gradient and coordinate descent methods.

The paper tackles the problem of slow initial progress in dual optimization by showing that bounded correlations between eigenfeatures and the response can accelerate early convergence, and it proposes provable initialization strategies for dual problems and heuristics for non-convex cases, with experimental validation.

Gradient descent and coordinate descent are well understood in terms of their asymptotic behavior, but less so in a transient regime often used for approximations in machine learning. We investigate how proper initialization can have a profound effect on finding near-optimal solutions quickly. We show that a certain property of a data set, namely the boundedness of the correlations between eigenfeatures and the response variable, can lead to faster initial progress than expected by commonplace analysis. Convex optimization problems can tacitly benefit from that, but this automatism does not apply to their dual formulation. We analyze this phenomenon and devise provably good initialization strategies for dual optimization as well as heuristics for the non-convex case, relevant for deep learning. We find our predictions and methods to be experimentally well-supported.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes