LGMLOct 1, 2018

Riemannian Adaptive Optimization Methods

arXiv:1810.00760v2342 citations
Originality Incremental advance
AI Analysis

This work addresses a gap in optimization for machine learning on non-Euclidean spaces, offering incremental improvements for tasks like hyperbolic embeddings.

The authors tackled the problem of generalizing adaptive optimization methods like Adam and Adagrad to Riemannian manifolds, providing algorithms and convergence proofs for geodesically convex objectives on product manifolds, and experimentally showed faster convergence and lower train loss in embedding the WordNet taxonomy in the Poincare ball.

Several first order stochastic optimization methods commonly used in the Euclidean domain such as stochastic gradient descent (SGD), accelerated gradient descent or variance reduced methods have already been adapted to certain Riemannian settings. However, some of the most popular of these optimization tools - namely Adam , Adagrad and the more recent Amsgrad - remain to be generalized to Riemannian manifolds. We discuss the difficulty of generalizing such adaptive schemes to the most agnostic Riemannian setting, and then provide algorithms and convergence proofs for geodesically convex objectives in the particular case of a product of Riemannian manifolds, in which adaptivity is implemented across manifolds in the cartesian product. Our generalization is tight in the sense that choosing the Euclidean space as Riemannian manifold yields the same algorithms and regret bounds as those that were already known for the standard algorithms. Experimentally, we show faster convergence and to a lower train loss value for Riemannian adaptive methods over their corresponding baselines on the realistic task of embedding the WordNet taxonomy in the Poincare ball.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes