Natural Gradient for Combined Loss Using Wavelets
This work addresses a specific optimization problem for machine learning practitioners, but it appears incremental as it builds on existing natural gradient methods.
The paper addresses the optimization of a convex linear combination of different loss functionals. It proposes a new natural gradient algorithm that uses compactly supported wavelets to approximately diagonalize the Hessian of the combined loss, and it reports numerical results demonstrating the algorithm's efficiency.
Natural gradients have been widely used in the optimization of loss functionals over probability space, with important examples such as Fisher-Rao gradient descent for Kullback-Leibler divergence, Wasserstein gradient descent for transport-related functionals, and Mahalanobis gradient descent for quadratic loss functionals. This note considers the situation in which the loss is a convex linear combination of these examples. We propose a new natural gradient algorithm that uses compactly supported wavelets to approximately diagonalize the Hessian of the combined loss. Numerical results are included to demonstrate the efficiency of the proposed algorithm.
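
As a rough illustration of the idea described above, the following sketch preconditions gradient descent with the diagonal of a combined Hessian expressed in an orthonormal Haar basis (the simplest compactly supported wavelet). It is not the algorithm of the paper. The quadratic model of the loss around a reference density `mu`, the choice `A = I - Laplacian` for the Mahalanobis part, the weights `a` and `b`, the exact line search, and all function names are assumptions made for this example, and the unit-mass constraint on the density is ignored for brevity.

```python
import numpy as np


def haar_matrix(n):
    """Orthonormal Haar wavelet matrix of size n x n (n must be a power of two)."""
    if n == 1:
        return np.array([[1.0]])
    h = haar_matrix(n // 2)
    top = np.kron(h, [1.0, 1.0])                   # coarser-scale rows
    bottom = np.kron(np.eye(n // 2), [1.0, -1.0])  # finest-scale detail rows
    return np.vstack([top, bottom]) / np.sqrt(2.0)


def wavelet_preconditioned_descent(n=64, a=0.5, b=0.5, n_steps=10):
    """Minimize a quadratic model of a combined loss with a wavelet-diagonal preconditioner."""
    # Reference density on a periodic grid (assumed for the demo).
    x = (np.arange(n) + 0.5) / n
    mu = 1.0 + 0.5 * np.sin(2.0 * np.pi * x)
    mu /= mu.sum()

    # Combined Hessian at p = mu:
    #   Fisher-Rao part: Hessian of KL(p || mu) at p = mu is diag(1 / mu).
    #   Mahalanobis part: an assumed SPD matrix A = I - periodic Laplacian.
    eye = np.eye(n)
    lap = np.roll(eye, 1, axis=1) + np.roll(eye, -1, axis=1) - 2.0 * eye
    A = eye - lap
    H = a * np.diag(1.0 / mu) + b * A

    # Approximate diagonalization: keep only the diagonal of H in the Haar basis.
    W = haar_matrix(n)
    d = np.diag(W @ H @ W.T)

    p = np.full(n, 1.0 / n)  # start from the uniform density
    energies = []
    for _ in range(n_steps):
        r = p - mu
        energies.append(0.5 * r @ H @ r)
        g = H @ r                              # gradient of the quadratic model
        direction = W.T @ ((W @ g) / d)        # apply W^T diag(1/d) W to the gradient
        curvature = direction @ H @ direction
        if curvature <= 0.0:                   # gradient vanished; nothing left to do
            break
        step = (g @ direction) / curvature     # exact line search for a quadratic
        p = p - step * direction
    r = p - mu
    energies.append(0.5 * r @ H @ r)
    return W, energies


W, energies = wavelet_preconditioned_descent()
print("initial energy:", energies[0])
print("final energy:  ", energies[-1])
```

Because the preconditioner is symmetric positive definite and the step size comes from an exact line search on the quadratic model, the energy cannot increase from one iteration to the next. How quickly it decreases depends on how close the Hessian is to diagonal in the chosen wavelet basis.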