LG · Feb 13, 2022

Generalized Tangent Kernel: A Unified Geometric Foundation for Natural Gradient and Standard Gradient

arXiv:2202.06232v4 · 3 citations
AI Analysis

This work provides a geometric framework for gradient-based optimization in machine learning that may inform both theoretical understanding and algorithm design, though it is incremental in that it extends existing kernel-based theories.

The paper addresses a theoretical gap concerning the existence of natural gradients on function spaces. It introduces the Generalized Tangent Kernel (GTK), which unifies natural and standard gradients within a Riemannian metric framework, and shows that, for a fixed parameterization, the standard gradient captures intrinsic geometric structure as effectively as the natural gradient.

Natural gradients have been widely studied from both theoretical and empirical perspectives, and it is commonly believed that natural gradients have advantages over standard (Euclidean) gradients in capturing the intrinsic geometric structure of the underlying function space and being invariant under reparameterization. However, for function optimization, a fundamental theoretical issue regarding the existence of natural gradients on the function space remains underexplored. We address this issue by providing a geometric perspective and mathematical framework for studying both natural gradient and standard gradient that is more complete than existing studies. The key tool that unifies natural gradient and standard gradient is a generalized form of the Neural Tangent Kernel (NTK), which we name the Generalized Tangent Kernel (GTK). Using a novel orthonormality property of GTK, we show that for a fixed parameterization, GTK determines a Riemannian metric on the entire function space which makes the standard gradient as "natural" as the natural gradient in capturing the intrinsic structure of the parameterized function space. Many aspects of this approach relate to RKHS theory. For the practical side of this theory paper, we showcase that our framework motivates new solutions to the non-immersion/degenerate case of natural gradient and leads to new families of natural/standard gradient descent methods.
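To make the two kinds of gradient in the abstract concrete, the following is a minimal NumPy sketch, not the paper's GTK construction. It assumes a hypothetical two-parameter toy model f(x) = a·tanh(b·x), a squared-error loss, the ordinary empirical tangent kernel K = J Jᵀ, and the pullback metric G = Jᵀ J; all function and variable names are illustrative. Under these assumptions, a standard gradient step moves the function values along K, while a natural gradient step moves them along an orthogonal projector onto the tangent space of the parameterized model.

```python
import numpy as np

def model(theta, x):
    """Hypothetical toy model: f(x) = a * tanh(b * x)."""
    a, b = theta
    return a * np.tanh(b * x)

def jacobian(theta, x):
    """Jacobian of the model outputs w.r.t. (a, b), shape (n_points, 2)."""
    a, b = theta
    t = np.tanh(b * x)
    return np.stack([t, a * x * (1.0 - t**2)], axis=1)

x = np.linspace(-1.0, 1.0, 5)      # evaluation points
y = np.sin(2.0 * x)                # illustrative regression targets
theta = np.array([0.7, 1.3])       # current parameters (a, b)

J = jacobian(theta, x)
residual = model(theta, x) - y     # dL/df for L = 0.5 * ||f - y||^2

K = J @ J.T                        # empirical tangent kernel (n x n)
G = J.T @ J                        # pullback metric on parameters (2 x 2)

# Parameter-space directions (unit step size).
standard_step = J.T @ residual
# pinv rather than inv, so a rank-deficient G does not raise an error.
natural_step = np.linalg.pinv(G) @ standard_step

# First-order change in the function values induced by each step.
df_standard = -K @ residual        # kernel-weighted residual
P = J @ np.linalg.pinv(G) @ J.T    # orthogonal projector onto range(J)
df_natural = -P @ residual         # projected residual
```

The pseudo-inverse is used here only as a simple, generic way to tolerate a singular metric in the sketch; it should not be read as the solution to the non-immersion/degenerate case that the abstract refers to.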

