LG | ML | Feb 24, 2020

Reparameterizing Mirror Descent as Gradient Descent

arXiv:2002.10487v2 | 10 citations
AI Analysis

This work provides a versatile framework for applying mirror descent, which can learn more efficiently than gradient descent when the target is sparse and the network is small. The contribution is incremental: it builds on existing optimization methods without broad state-of-the-art impact.

The authors make mirror descent updates more accessible by reparameterizing them as gradient descent updates, which in some cases allows them to be implemented with standard backpropagation. The construction applies to the continuous versions of the updates; finding general criteria for the discrete versions to closely track the continuous ones remains an open problem.

Most of the recent successful applications of neural networks have been based on training with gradient descent updates. However, for some small networks, other mirror descent updates learn provably more efficiently when the target is sparse. We present a general framework for casting a mirror descent update as a gradient descent update on a different set of parameters. In some cases, the mirror descent reparameterization can be described as training a modified network with standard backpropagation. The reparameterization framework is versatile and covers a wide range of mirror descent updates, even cases where the domain is constrained. Our construction for the reparameterization argument is done for the continuous versions of the updates. Finding general criteria for the discrete versions to closely track their continuous counterparts remains an interesting open problem.
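As a rough illustration of the idea, the sketch below compares one familiar mirror descent update, the unnormalized exponentiated gradient (EGU) step `w <- w * exp(-lr * grad)`, with plain gradient descent on a different parameter `u`, where `w = u**2 / 4`. This particular update, the quadratic map, the toy regression problem, and all function names are assumptions chosen for illustration; none of them is specified in the text above. In continuous time the two coincide by the chain rule: gradient flow on `u` gives `du/dt = -(u/2) * grad`, hence `dw/dt = (u/2) * du/dt = -w * grad`, which is the EGU flow. The discrete updates only agree approximately, in line with the open problem noted in the abstract.

```python
import numpy as np

def loss_grad(w, X, y):
    """Gradient of 0.5 * mean((X @ w - y)**2) with respect to w."""
    return X.T @ (X @ w - y) / len(y)

def egu_step(w, grad, lr):
    """Unnormalized exponentiated gradient step (mirror descent, log link)."""
    return w * np.exp(-lr * grad)

def reparam_gd_step(u, X, y, lr):
    """Plain gradient descent on u, with weights defined as w = u**2 / 4."""
    w = u ** 2 / 4
    grad_u = (u / 2) * loss_grad(w, X, y)  # chain rule: dw/du = u / 2
    return u - lr * grad_u

def run(steps=2000, lr=0.01, seed=0):
    # Hypothetical toy problem: noise-free linear regression, sparse target.
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(50, 10))
    w_true = np.zeros(10)
    w_true[0] = 1.0
    y = X @ w_true

    w_md = np.full(10, 0.1)   # mirror descent iterate (must stay positive)
    u = 2 * np.sqrt(w_md)     # chosen so that u**2 / 4 equals the initial w
    for _ in range(steps):
        w_md = egu_step(w_md, loss_grad(w_md, X, y), lr)
        u = reparam_gd_step(u, X, y, lr)
    return w_md, u ** 2 / 4, w_true
```

Calling `run()` returns the EGU weights, the weights recovered from the reparameterized gradient descent run, and the sparse target. With a small step size the first two should stay close to each other, and the gap is expected to grow as `lr` increases, since the match is exact only for the continuous updates.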

